METHODS OF DIAGNOSIS OF LUNG CANCER, COMPOSITIONS AND METHODS
OF SCREENING FOR MODULATORS OF LUNG CANCER 



CROSS-REFERENCES TO RELATED APPLICATIONS 

This application is related to USSN 60/284,770, filed April 18, 2001; USSN 
60/290,492, filed May 10, 2001; USSN 60/334,370, filed November 29, 2001; USSN 
60/339,245, filed November 9, 2001; USSN 60/350,666, filed November 13, 2001; and 
USSN 60/xxx,xxx, filed April 12, 2002 (Docket OMNI-002P); each of which is incorporated
herein by reference in its entirety. 

FIELD OF THE INVENTION 
The invention relates to the identification of nucleic acid and protein expression 
profiles and nucleic acids, products, and antibodies thereto that are involved in lung cancer,
and to the use of such expression profiles and compositions in diagnosis and therapy of lung
cancer. The invention further relates to methods for identifying and using agents and/or
targets that inhibit lung cancer or related conditions.

BACKGROUND OF THE INVENTION

Lung cancer is the second most commonly occurring cancer in the United States and
is the leading cause of cancer-related death. It is estimated that there are over 160,000 new
cases of lung cancer in the United States every year. Of those who are diagnosed with lung
cancer, 86 percent will die within five years. Lung cancer is the most common visceral
cancer in men and accounts for nearly one third of all cancer deaths in both men and women.
In fact, lung cancer accounts for 7% of all deaths, due to any cause, in both men and women.

Smoking is the primary cause of lung cancer, with more than 80% of lung cancers
resulting from smoking. About 400 to 500 separate gaseous substances are present in the
smoke of a non-filter cigarette. The most noteworthy substances include nitrogen oxides,
hydrogen cyanide, formaldehyde, benzene, and toluene. The particles present in cigarette
smoke contain at least 3,500 individual compounds such as nicotine, tobacco alkaloids
(nornicotine, anatabine, anabasine), polycyclic aromatic hydrocarbons (e.g., benzo(a)pyrene,
B(a)P), naphthalenes, aromatic amines, phenols, and tobacco-specific nitrosamines.
Tobacco-specific nitrosamines are formed during tobacco curing and processing, and
are suspected of causing lung cancer in humans. In rodent studies, regardless of where or
how it is applied, the tobacco-specific nitrosamine known as NNK produces lung adenomas
and lung adenocarcinomas. The tobacco-specific nitrosamine known as NNAL also produces
lung adenocarcinomas in rodents.

Many of the chemicals found in cigarette smoke also affect the nonsmoker inhaling
"secondhand" or sidestream smoke. Indeed, the smoke inhaled by nonsmokers has a
chemical composition similar to the smoke inhaled by smokers, but, importantly, the
carcinogenic tobacco-specific nitrosamines are present in higher
concentrations in secondhand smoke. For this and other reasons, "passive smoking" is an
important cause of lung cancer, causing as many as 3,000 lung cancer deaths in nonsmokers
each year.

In addition to smoking, other factors thought to be causes of lung cancer include
on-the-job exposure to carcinogens such as asbestos and uranium; exposure to chemical hazards
such as radon, polycyclic aromatic hydrocarbons, chromium, nickel, and inorganic arsenic;
genetic factors; and diet.

Histological classification of the various lung cancers defines the types of cancer that
begin in the lung. See, e.g., Travis, et al. (1999) Histological Typing of Lung and Pleural
Tumours (International Histological Classification of Tumours, No. 1). Four major cell types
make up more than 88% of all primary lung neoplasms. These are: squamous or epidermoid
carcinoma, small cell (also called oat cell) carcinoma, adenocarcinoma, and large cell (also
called large cell anaplastic) carcinoma. The remainder include undifferentiated carcinomas,
carcinoids, bronchial gland tumors, and other rarer types. The various cell types have
different natural histories and responses to therapy, and, thus, a correct histologic diagnosis is
the first step of effective treatment.

Small cell lung cancer (SCLC) accounts for 18-25% of all lung cancers. It occurs
less frequently than non-small cell lung cancer, but generally spreads to distant organs more
rapidly than non-small cell lung cancer. In general, at the time of presentation small cell lung
cancers have already spread beyond the bounds where surgery with curative intent
can be undertaken. However, if identified early enough, these cancers are often responsive to
chemotherapy and thoracic radiation treatment.

Non-small cell lung cancers (NSCLC) are the more frequently occurring form of lung
cancer. They comprise squamous cell carcinoma, adenocarcinoma, and large cell carcinoma
and account for more than 75% of all lung cancers. Non-small cell tumors that are localized
at the time of presentation can sometimes be cured with surgery and/or radiotherapy, but
usually they are not identified until significant metastasis has occurred, and metastatic
tumors are typically not very responsive to surgical, chemotherapy, or radiation treatment.

The screening of asymptomatic persons at high risk for lung cancer has often proven
ineffective. In general, only 5 to 15 percent of lung cancer patients have their disease
detected while they are asymptomatic. Of course, early detection and treatment are critical
factors in the fight against lung cancer. The average survival rate is 49% for those whose
cancer is detected early, before the cancer has spread from the lung. Lung cancer often
spreads outside of the lung, and it may have spread to the bones or brain by the time it is
diagnosed. While the prognosis may be better for lung cancers that are detected early,
because of the lack of effective curative treatments, early detection does not necessarily alter
the total death rate from lung cancer.

Thus, methods for diagnosis and prognosis of lung cancer and effective treatment of
lung cancer would be desirable. Accordingly, provided herein are methods that can be used
in diagnosis and prognosis of lung cancer. Further provided are methods that can be used to
screen candidate therapeutic agents for the ability to modulate, e.g., treat, lung cancer.
Additionally, provided herein are molecular targets and compositions for therapeutic
intervention in lung disease and other metastatic cancers.

SUMMARY OF THE INVENTION
The present invention provides nucleotide sequences of genes that are up- and
down-regulated in lung cancer cells. Such genes are useful for diagnostic purposes, and also as
targets for screening for therapeutic compounds that modulate lung cancer, such as
antibodies. The methods of detecting nucleic acids of the invention or their encoded proteins
can be used for a number of purposes. Examples include early detection of lung cancers,
monitoring and early detection of relapse following treatment of lung cancers, monitoring
response to therapy of lung cancers, determining prognosis of lung cancers, directing therapy
of lung cancers, selecting patients for postoperative chemotherapy or radiation therapy,
selecting therapy, determining tumor prognosis, treatment, or response to treatment, and early
detection of precancerous lesions of the lung. Examples of benign or precancerous lesions
include: atelectasis, emphysema, bronchitis, chronic obstructive pulmonary disease, fibrosis,
hypersensitivity pneumonitis (HP), interstitial pulmonary fibrosis (IPF), asthma, and
bronchiectasis. Other aspects of the invention will become apparent to the skilled artisan by
the following description of the invention.

In one aspect, the present invention provides a method of detecting a lung
cancer-associated transcript in a cell from a patient, the method comprising contacting a biological
sample from the patient with a polynucleotide that selectively hybridizes to a sequence at
least 80% identical to a sequence as shown in Tables 1A-16. Alternatively, the sample may
be contacted with a specific binding reagent, e.g., antibody.

In one embodiment, the polynucleotide selectively hybridizes to a sequence at least
95% identical to a sequence as shown in Tables 1A-16. In another embodiment, the
polynucleotide comprises a sequence as shown in Tables 1A-16.

In one embodiment, the biological sample is a tissue sample, or a body fluid. In 
another embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA. 

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent label. In
one embodiment, the polynucleotide is immobilized on a solid surface. In one embodiment,
the patient is undergoing a therapeutic regimen to treat lung cancer. In another embodiment,
the patient is suspected of having lung cancer. In one embodiment, the patient is a primate,
e.g., a human.

In one embodiment, the method further comprises the step of amplifying nucleic acids
before the step of contacting the biological sample with the polynucleotide.

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated transcript in the biological sample by contacting the
biological sample with a polynucleotide that selectively hybridizes to a sequence at least 80%
identical to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the
therapy. Alternatively, the sample may be evaluated for protein, e.g., by contacting the sample
with an antibody.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated transcript to a level of the lung cancer-associated
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment. Alternatively, the sample may be evaluated for comparison of protein.

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated antibody in the biological sample by contacting the
biological sample with a polypeptide encoded by a polynucleotide that selectively hybridizes
to a sequence at least 80% identical to a sequence as shown in Tables 1A-16, wherein the
polypeptide specifically binds to the lung cancer-associated antibody, thereby monitoring the
efficacy of the therapy.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated antibody to a level of the lung cancer-associated antibody
in a biological sample from the patient prior to, or earlier in, the therapeutic treatment.

In another aspect, the present invention provides a method of monitoring the efficacy
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining
the level of a lung cancer-associated polypeptide in the biological sample by contacting the
biological sample with an antibody, wherein the antibody specifically binds to a polypeptide
encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical
to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the therapy.

In one embodiment, the method further comprises the step of: (iii) comparing the
level of the lung cancer-associated polypeptide to a level of the lung cancer-associated
polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment. In one aspect, the present invention provides an isolated nucleic acid molecule
consisting of a polynucleotide sequence as shown in Tables 1A-16. In one embodiment, an
expression vector or cell comprises the isolated nucleic acid. In one aspect, the present
invention provides an isolated polypeptide which is encoded by a nucleic acid molecule
having a polynucleotide sequence as shown in Tables 1A-16.

In another aspect, the present invention provides an antibody that specifically binds to
an isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide
sequence as shown in Tables 1A-16. In one embodiment, the antibody is conjugated to an
effector component, e.g., a fluorescent label, a radioisotope or a cytotoxic chemical. In one
embodiment, the antibody is an antibody fragment. In another embodiment, the antibody is
humanized.

In one aspect, the present invention provides a method of detecting lung cancer in a
patient, the method comprising contacting a biological sample from the patient with an
antibody or protein as described herein.

In another aspect, the present invention provides a method of detecting antibodies
specific to a lung cancer gene in a patient, the method comprising contacting a biological
sample from the patient with a polypeptide encoded by a nucleic acid that comprises a sequence
from Tables 1A-16.

In another aspect, the present invention provides a method for identifying a compound
that modulates a lung cancer-associated polypeptide, the method comprising the steps of: (i)
contacting the compound with a lung cancer-associated polypeptide, the polypeptide encoded
by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a
sequence as shown in Tables 1A-16; and (ii) determining the functional effect of the
compound upon the polypeptide.

In one embodiment, the functional effect is a physical effect, an enzymatic effect, or a
chemical effect. In one embodiment, the polypeptide is expressed in a eukaryotic host cell or
cell membrane. In another embodiment, the polypeptide is recombinant. In one
embodiment, the functional effect is determined by measuring ligand binding to the
polypeptide.

In another aspect, the present invention provides a method of inhibiting proliferation
or another critical process of a lung cancer-associated cell to treat lung cancer in a patient, the
method comprising the step of administering to the subject a therapeutically effective amount
of a compound identified as described herein. In one embodiment, the compound is an
antibody.

In another aspect, the present invention provides a drug screening assay comprising
the steps of: (i) administering a test compound to a mammal having lung cancer or a cell
isolated therefrom; and (ii) comparing the level of gene expression of a polynucleotide that
selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables
1A-16 in a treated cell or mammal with the level of gene expression of the polynucleotide in
a control cell or mammal, wherein a test compound that modulates the level of expression of
the polynucleotide is a candidate for the treatment of lung cancer.

In one embodiment, the control is a mammal with lung cancer or a cell therefrom that
has not been treated with the test compound. In another embodiment, the control is a normal
cell or mammal, or a non-malignant lung disease.

In another aspect, the present invention provides a method for treating a mammal
having lung cancer comprising administering a compound identified by the assay described
herein.

In another aspect, the present invention provides a pharmaceutical composition for
treating a mammal having lung cancer, the composition comprising a compound identified by
the assay described herein and a physiologically acceptable excipient.

DETAILED DESCRIPTION OF THE INVENTION

In accordance with the objects outlined above, the present invention provides novel
methods for diagnosis and treatment of lung disease or cancer, as well as methods for
screening for compositions which modulate lung cancer. "Treatment, monitoring, detection
or modulation of lung disease or cancer" includes treatment, monitoring, detection, or
modulation of lung disease in those patients who have lung disease (whether malignant or
non-malignant, e.g., emphysema, bronchitis, or fibrosis) as well as patients with lung cancers
in which gene expression from a gene in Tables 1A-16 is increased or decreased, indicating
that the subject is more likely to have disease. In particular, while these targets are identified
primarily from lung cancer samples, these same targets are likely to be similarly found in
analyses of other medical conditions. These other conditions may result from similar
pathological processes which affect similar tissues, e.g., lung cancer, small cell lung
carcinoma (oat cell carcinoma), non-small cell carcinomas (e.g., squamous cell carcinoma,
adenocarcinoma, large cell lung carcinoma, carcinoid, granulomatous), fibrosis (idiopathic
pulmonary fibrosis (IPF), hypersensitivity pneumonitis (HP), interstitial pneumonitis,
nonspecific idiopathic pneumonitis (NSIP)), chronic obstructive pulmonary disease (COPD,
e.g., emphysema, chronic bronchitis), asthma, bronchiectasis, and esophageal cancer. See,
e.g., the Na webpage and USSN 60/347,349 and USSN 60/xxx,xxx (docket LFBR-001-1P,
filed March 29, 2002), each of which is incorporated herein by reference. The treatment may
be of lung cancer or a related condition itself, or treatment of metastasis.

In particular, identification of markers selectively expressed on these cancers allows
for use of that expression in diagnostic, prognostic, or therapeutic methods. As such, the
invention defines various compositions, e.g., nucleic acids, polypeptides, antibodies, and
small molecule agonists/antagonists, which will be useful to selectively identify those
markers. For example, therapeutic methods may take the form of protein therapeutics which
use the marker expression for selective localization or modulation of function (for those
markers which have a causative disease effect), for vaccines, identification of binding
partners, or antagonism, e.g., using antisense or RNAi. The markers may be useful for
molecular characterization of subsets of lung diseases, which subsets may actually require
very different treatments. Moreover, the markers may also be important in diseases related to
the specific cancers, e.g., diseases which affect similar tissues in non-malignant form, or have
similar mechanisms of induction/maintenance. Metastatic processes or characteristics may
also be targeted. Diagnostic and prognostic uses are made available, e.g., to subset related
but distinct diseases, or to determine treatment strategy. The detection methods may be based
upon nucleic acid, e.g., PCR or hybridization techniques, or protein, e.g., ELISA, imaging,
IHC, etc. The diagnosis may be qualitative or quantitative, and may detect increases or
decreases in expression levels.

Tables 1A-16 provide unigene cluster identification numbers for the nucleotide
sequence of genes that exhibit increased or decreased expression in lung cancer samples. The
tables also provide an exemplar accession number that provides a nucleotide sequence that is
part of the unigene cluster. In Table 1A, genes marked as "target 1" or "target 2" are
particularly useful as therapeutic targets. Genes marked as "target 3" are particularly useful
as diagnostic markers. Genes marked as "chron" are upregulated in chronically diseased lung
(e.g., emphysema, bronchitis, fibrosis) relative to lung tumors and normal tissue. In certain
analyses, the ratio for the "chron" category was determined using the 70th percentile of
chronically diseased lung samples divided by the 90th percentile of normal lung samples.
The ratio for the targets was determined using the 70th percentile of lung tumor samples
divided by the 90th percentile of normal lung samples.
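
The percentile ratio described above can be illustrated with a short sketch. It assumes
expression values are held in plain Python lists and that percentiles are computed by linear
interpolation between ranked values; the function names `percentile` and `expression_ratio`,
the interpolation rule, and the sample intensities are hypothetical and are not taken from
the tables or analyses of this document.

```python
def percentile(values, pct):
    """Return the pct-th percentile (0-100) by linear interpolation (assumed rule)."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Fractional rank of the requested percentile within the sorted list.
    rank = pct * (len(ordered) - 1) / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def expression_ratio(disease_samples, normal_samples,
                     disease_pct=70.0, normal_pct=90.0):
    """70th percentile of disease samples divided by 90th percentile of normal samples."""
    denominator = percentile(normal_samples, normal_pct)
    if denominator == 0:
        raise ZeroDivisionError("normal-sample percentile is zero")
    return percentile(disease_samples, disease_pct) / denominator


# Hypothetical expression intensities in arbitrary units.
tumor = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0,
         700.0, 800.0, 900.0, 1000.0, 1100.0]
normal = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0,
          70.0, 80.0, 90.0, 100.0, 110.0]
ratio = expression_ratio(tumor, normal)
print(ratio)
```

Using a high percentile of the normal samples as the denominator makes the ratio conservative:
a gene scores well only when most disease samples exceed nearly all normal samples.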

Definitions

The term "lung cancer protein" or "lung cancer polynucleotide" or "lung
cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic variants, alleles,
mutants, and interspecies homologs that: (1) have a nucleotide sequence that has greater than
about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or greater nucleotide sequence identity,
preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or
more nucleotides, to a nucleotide sequence of or associated with a unigene cluster of Tables
1A-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an immunogen
comprising an amino acid sequence encoded by a nucleotide sequence of or associated with a
unigene cluster of Tables 1A-16, and conservatively modified variants thereof; (3)
specifically hybridize under stringent hybridization conditions to a nucleic acid sequence, or
the complement thereof, of Tables 1A-16 and conservatively modified variants thereof; or (4)

WO 02/086443 PCT/US02/12476

have an amino acid sequence that has greater than about 60% amino acid sequence identity,
65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%,
or 99% or greater amino acid sequence identity, preferably over a region of at
least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid sequence
encoded by a nucleotide sequence of or associated with a unigene cluster of Tables 1A-16. A
polynucleotide or polypeptide sequence is typically from a mammal including, but not
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or
other mammal. A "lung cancer polypeptide" and a "lung cancer polynucleotide" include
both naturally occurring and recombinant forms.

A "full length" lung cancer protein or nucleic acid refers to a lung cancer polypeptide
or polynucleotide sequence, or a variant thereof, that contains the elements normally
contained in one or more naturally occurring, wild type lung cancer polynucleotide or
polypeptide sequences. The "full length" may be prior to, or after, various stages of
post-translational processing or splicing, including alternative splicing.

"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a lung cancer protein, polynucleotide, or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice, and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
archival materials, blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc.
Biological samples also include explants and primary and/or transformed cell cultures
derived from patient tissues. A biological sample is typically obtained from a eukaryotic
organism, most preferably a mammal such as a primate, e.g., chimpanzee or human; cow;
dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or other mammal; or a bird; reptile;
fish. Livestock and domestic animals are of interest.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues or materials, having treatment or outcome
history, will be particularly useful.

The terms "identical" or percent "identity," in the context of two or more nucleic
acids or polypeptide sequences, refer to two or more sequences or subsequences that are the
same or have a specified percentage of amino acid residues or nucleotides that are the same
(e.g., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using, e.g., a BLAST or BLAST 2.0 sequence comparison algorithm with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or insertions, substitutions, and naturally occurring, e.g., polymorphic or allelic variants,
and man-made variants. As described below, the preferred algorithms can account for gaps
and the like. Preferably, identity exists over a region that is at least about 25 amino acids or
nucleotides in length, or more preferably over a region that is 50-100 amino acids or
nucleotides in length.
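
As a minimal sketch of the percent identity calculation described above, the following
function counts identical positions between two sequences that have already been aligned
(for example, by one of the algorithms discussed below). The function name
`percent_identity`, the use of "-" as the gap symbol, and the choice to count a gap opposite
a residue as a mismatch are assumptions made for illustration; alignment programs differ in
how they treat gapped columns when reporting identity.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns holding the same residue.

    Assumptions: both inputs are rows of one alignment (equal length);
    columns that are gaps in both rows are skipped; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap and residue_b == gap:
            continue  # column carries no information
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical 10-nucleotide region with a single substitution.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

To restrict the calculation to a specified region, the caller can slice both aligned strings
to the same column range before calling the function.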

For sequence comparison, typically one sequence acts as a reference sequence, to
which test sequences are compared. When using a sequence comparison algorithm, test and
reference sequences are entered into a computer, subsequence coordinates are designated, if
necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.

A "comparison window", as used herein, includes reference to a segment of
contiguous positions selected from the group consisting typically of from 20 to 600, usually
about 50 to about 200, more usually about 100 to about 150 in which a sequence may be
compared to a reference sequence of the same number of contiguous positions after the two
sequences are optimally aligned. Methods of alignment of sequences for comparison are
well-known in the art. Optimal alignment of sequences for comparison can be conducted,
e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math.
2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol.
48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat'l. Acad.
Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT,
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer
Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection (see,
e.g., Ausubel, et al. (eds. 1995 and supplements) Current Protocols in Molecular Biology).
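
To make the idea of local alignment concrete, the sketch below computes the best local
alignment score of two sequences by dynamic programming in the style of the Smith and
Waterman method cited above. It is a simplified illustration, not the published algorithm in
full: it uses a linear gap penalty and returns only the score, and the scoring values
(match=2, mismatch=-1, gap=-2) and the function name `local_alignment_score` are assumptions
chosen for the example.

```python
def local_alignment_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (assumed scoring)."""
    cols = len(seq_b) + 1
    previous_row = [0] * cols
    best = 0
    for i in range(1, len(seq_a) + 1):
        current_row = [0] * cols
        for j in range(1, cols):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            current_row[j] = max(
                0,                            # start a new local alignment
                previous_row[j - 1] + pair,   # align the two residues
                previous_row[j] + gap,        # gap in seq_b
                current_row[j - 1] + gap,     # gap in seq_a
            )
            best = max(best, current_row[j])
        previous_row = current_row
    return best


# Hypothetical sequences sharing the local segment "ACG".
print(local_alignment_score("TTACGTTT", "GGACGGG"))
```

Resetting negative running scores to zero is what makes the alignment local: unrelated
flanking sequence cannot drag down the score of a well-matched internal segment.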

Preferred examples of algorithms that are suitable for determining percent sequence
identity and sequence similarity include the BLAST and BLAST 2.0 algorithms, which are
described in Altschul, et al. (1997) Nuc. Acids Res. 25:3389-3402 and Altschul, et al. (1990)
J. Mol. Biol. 215:403-410. BLAST and BLAST 2.0 are used, with the parameters described
herein, to determine percent sequence identity for the nucleic acids and proteins of the
invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul, et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always >0) and N (penalty score for mismatching residues; always <0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
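
The extension step described above can be sketched as follows for nucleotide sequences. This
is a simplified, ungapped, rightward-only illustration of the three stopping rules (score
falls off by X from its maximum, score reaches zero or below, or a sequence ends); it is not
the BLAST implementation itself. The function name `extend_right`, its argument names, and
the default drop-off value `x_drop=20` are hypothetical; only the reward M=5 and penalty
N=-4 come from the text.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 reward=5, penalty=-4, x_drop=20):
    """Extend a seed word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts residues
    from the start of the seed. x_drop is an assumed value for X.
    """
    def pair_score(a, b):
        return reward if a == b else penalty

    # Score the seed word itself.
    score = sum(pair_score(query[q_start + k], subject[s_start + k])
                for k in range(word_len))
    best_score, best_length = score, word_len
    length = word_len

    # Stop when either sequence ends.
    while q_start + length < len(query) and s_start + length < len(subject):
        score += pair_score(query[q_start + length], subject[s_start + length])
        length += 1
        if score > best_score:
            best_score, best_length = score, length
        # Stop when the score drops by X from its maximum, or reaches zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
    return best_score, best_length


# Hypothetical sequences: eight matching residues followed by two mismatches.
print(extend_right("ACGTACGTAC", "ACGTACGTTT", 0, 0, 4))
```

A full implementation would also extend to the left of the seed and combine the two halves;
a larger X tolerates longer runs of mismatches at the cost of more computation.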

The BLAST algorithm also performs a statistical analysis of the similarity between
two sequences (see, e.g., Karlin and Altschul (1993) Proc. Nat'l. Acad. Sci. USA
90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability by which a match between
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid
is considered similar to a reference sequence if the smallest sum probability in a comparison
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably
less than about 0.01, and most preferably less than about 0.001. Log values may be negative
large numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.

An indication that two nucleic acid sequences are substantially identical is that the
polypeptide encoded by the first nucleic acid is immunologically cross reactive with the
antibodies raised against the polypeptide encoded by the second nucleic acid. Thus, a
polypeptide is typically substantially identical to a second polypeptide, e.g., where the two
peptides differ only by conservative substitutions. Another indication that two nucleic acid
sequences are substantially identical is that the two molecules or their complements hybridize
to each other under stringent conditions. Yet another indication that two nucleic acid
sequences are substantially identical is that the same primers can be used to amplify the
sequences.

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is
substantially or essentially free from components that normally accompany it as found in its
native state. Purity and homogeneity are typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than protein
encoded by the gene. The term "purified" in some embodiments denotes that a nucleic acid
or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least about 85% pure, more preferably at least 95% pure,
and most preferably at least 99% pure. "Purify" or "purification" in other embodiments
means removing at least one contaminant or component from the composition to be purified.
In this sense, purification does not require that the purified compound be homogeneous, e.g.,
100% pure.

The terms "polypeptide," "peptide" and "protein" are used interchangeably herein to
refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which
one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally
occurring amino acid, as well as to naturally occurring amino acid polymers, those containing
modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids, as well
as amino acid analogs and amino acid mimetics that function similarly to the naturally
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refer to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refer to
chemical compounds that have a structure that is different from the general chemical
structure of an amino acid, but that function similarly to another amino acid.

Amino acids may be referred to herein by either their commonly known three letter
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
adopted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic acid
sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG, and GCU each encode the amino
acid alanine. Thus, at each position where an alanine is specified by a codon, the codon can
be altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. In certain contexts each
codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and
TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a
functionally similar molecule. Accordingly, a silent variation of a nucleic acid which
encodes a polypeptide is implicit in a described sequence with respect to the expression
product, but not necessarily with respect to actual probe sequences.
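
By way of non-limiting illustration, the silent variations described above may be enumerated as in the following Python sketch, which uses only the four alanine codons recited in the text. The function name silent_variants, the table ALA_TABLE, and the example sequence are hypothetical and introduced solely for illustration; codons absent from the table are left unchanged.

```python
from itertools import product

# Alanine codons exactly as recited in the text (RNA alphabet).
ALANINE_CODONS = ("GCA", "GCC", "GCG", "GCU")

# Hypothetical lookup: each alanine codon maps to the full set of synonyms.
ALA_TABLE = {codon: ALANINE_CODONS for codon in ALANINE_CODONS}


def silent_variants(rna: str, synonyms: dict[str, tuple[str, ...]]) -> list[str]:
    """Return every sequence obtained by swapping codons for listed synonyms."""
    if len(rna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    # A codon with no entry in the table has exactly one option: itself.
    options = [synonyms.get(codon, (codon,)) for codon in codons]
    return ["".join(choice) for choice in product(*options)]


# Met-Ala-Ala-Trp: two alanine positions with four codons each.
variants = silent_variants("AUGGCAGCCUGG", ALA_TABLE)
print(len(variants))  # 16
```

Every variant in the returned list encodes the same polypeptide as the input, and the input sequence itself is among them.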

As to amino acid sequences, one of skill will recognize that individual substitutions,
deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which
alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded
sequence is a "conservatively modified variant" where the alteration results in the substitution
of an amino acid with a chemically similar amino acid. Conservative substitution tables
providing functionally similar amino acids are well known in the art. Such conservatively
modified variants are in addition to and do not exclude polymorphic variants, interspecies
homologs, and alleles of the invention. Typically, conservative substitutions for one
another include: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine
(N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine
(M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S),
Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
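
By way of non-limiting illustration, the eight substitution groups recited above may be encoded and queried as in the following Python sketch. The function names are hypothetical, and the comparison of equal-length peptides position by position (without alignment, insertions, or deletions) is a simplifying assumption.

```python
# The eight groups exactly as enumerated in the text (one-letter codes).
# Methionine (M) appears in both group 5 and group 8, as in the text.
CONSERVATIVE_GROUPS = (
    frozenset("AG"), frozenset("DE"), frozenset("NQ"), frozenset("RK"),
    frozenset("ILMV"), frozenset("FYW"), frozenset("ST"), frozenset("CM"),
)


def is_conservative(a: str, b: str) -> bool:
    """True if residues a and b differ and share at least one listed group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def differ_only_conservatively(p: str, q: str) -> bool:
    """True if two equal-length peptides differ only by listed substitutions."""
    if len(p) != len(q):
        return False
    return all(x == y or is_conservative(x, y)
               for x, y in zip(p.upper(), q.upper()))


print(differ_only_conservatively("MKTA", "MRSG"))  # True
print(differ_only_conservatively("MKTA", "MKTW"))  # False
```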

Macromolecular structures such as polypeptide structures can be described in terms of
various levels of organization. For a general discussion of this organization, see, e.g.,
Alberts, et al. (1994) Molecular Biology of the Cell (3rd ed.) and Cantor and Schimmel (1980)
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules. "Primary
structure" refers to the amino acid sequence of a particular peptide. "Secondary structure"
refers to locally ordered, three dimensional structures within a polypeptide. These structures
are commonly known as domains. Domains are portions of a polypeptide that often form a
compact unit of the polypeptide and are typically 25 to approximately 500 amino acids long.
Typical domains are made up of sections of lesser organization such as stretches of β-sheet
and α-helices. "Tertiary structure" refers to the complete three dimensional structure of a
polypeptide monomer. "Quaternary structure" refers to the three dimensional structure
formed, usually by the noncovalent association of independent tertiary units. Anisotropic
terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical equivalents
used herein means at least two nucleotides covalently linked together. Oligonucleotides are
typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up
to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any
length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000,
etc. A nucleic acid of the present invention will generally contain phosphodiester bonds,
although in some cases, nucleic acid analogs are included that may have at least one different
linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or
O-methylphosphoroamidite linkages (see Eckstein (1992) Oligonucleotides and Analogues: A
Practical Approach Oxford University Press); and peptide nucleic acid backbones and
linkages. Other analog nucleic acids include those with positive backbones; non-ionic
backbones, and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, in Sanghvi and Cook, eds. Carbohydrate
Modifications in Antisense Research, ACS Symposium Series 580. Nucleic acids containing
one or more carbocyclic sugars are also included within one definition of nucleic acids.
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to
increase the stability and half-life of such molecules in physiological environments or as
probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made;
alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring
nucleic acids and analogs may be made.

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic
acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4° C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9° C.
Similarly, due to their non-ionic nature, hybridization of the bases attached to these
backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded
by cellular enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or contain
portions of both double stranded or single stranded sequence. As will be appreciated by those
in the art, the depiction of a single strand also defines the sequence of the complementary
strand; thus the sequences described herein also provide the complement of the sequence.
The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the
nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations
of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine,
hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a naturally
occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
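
By way of non-limiting illustration, the statement that a depicted single strand also defines its complementary strand may be sketched in Python as follows. The function name reverse_complement is hypothetical; restricting the input to the four unmodified DNA bases and writing the complementary strand in the 5' to 3' direction (i.e., reversed) are assumptions of this sketch.

```python
# Watson-Crick pairing for unmodified DNA bases (upper and lower case).
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    unexpected = set(dna) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unsupported bases: {sorted(unexpected)}")
    # Complement each base, then reverse to restore 5'->3' orientation.
    return dna.translate(_DNA_COMPLEMENT)[::-1]


print(reverse_complement("ATGGCC"))  # GGCCAT
```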

A "label" or a "detectable moiety" is a composition detectable by spectroscopic,
photochemical, biochemical, immunochemical, physiological, chemical, or other physical
means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents,
enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins
or other entities which can be made detectable, e.g., by incorporating a radiolabel into the
peptide or used to detect antibodies specifically reactive with the peptide. The labels may be
incorporated into the cancer nucleic acids, proteins, and antibodies. Many methods known in
the art for conjugating the antibody to the label may be employed, including those methods
described by Hunter, et al. (1962) Nature 144:945; David, et al. (1974) Biochemistry
13:1014-1021; Pain, et al. (1981) J. Immunol. Meth. 40:219-230; and Nygren (1982) J.
Histochem. and Cytochem. 30:407-412.

An "effector" or "effector moiety" or "effector component" is a molecule that is
bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard" e.g., beta radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is a nucleic acid capable of
binding to a target nucleic acid of complementary sequence through one or more types of
chemical bonds, usually through complementary base pairing, e.g., through hydrogen bond
formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or modified bases
(7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be joined by a
linkage other than a phosphodiester bond, preferably one that does not functionally interfere
with hybridization. Thus, e.g., probes may be peptide nucleic acids in which the constituent
bases are joined by peptide bonds rather than phosphodiester linkages. Probes may bind
target sequences lacking complete complementarity with the probe sequence depending upon
the stringency of the hybridization conditions. The probes are preferably directly labeled,
e.g., with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled, e.g., with
biotin to which a streptavidin complex may later bind. By assaying for the presence or
absence of the probe, one can detect the presence or absence of the select sequence or
subsequence. Diagnosis or prognosis may be based at the genomic level, or at the level of
RNA or protein expression.

The term "recombinant" when used with reference, e.g., to a cell, or nucleic acid,
protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by
the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic
acid or protein, or that the cell is derived from a cell so modified. Thus, e.g., recombinant
cells express genes that are not found within the native (non-recombinant) form of the cell or
express native genes that are otherwise abnormally expressed, under expressed or not
expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic
acid as depicted above.

The term "heterologous" when used with reference to portions of a nucleic acid
indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two
or more subsequences that are not found in the same relationship to each other in nature (e.g.,
a fusion protein).

A "promoter" is typically an array of nucleic acid control sequences that direct
transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid
sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
e.g., wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed in operable linkage to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule selectively to a particular nucleotide sequence under
stringent hybridization conditions when that sequence is present in a complex mixture (e.g.,
total cellular or library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under which a
probe will hybridize to its target subsequence, typically in a complex mixture of nucleic
acids, but to essentially no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
"Overview of principles of hybridization and the strategy of nucleic acid assays" in Tijssen
(1993) Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic
Probes (vol. 24) Elsevier. Generally, stringent conditions are selected to be about 5-10° C
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid
concentration) at which 50% of the probes complementary to the target hybridize to the target
sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the
probes are occupied at equilibrium). Stringent conditions will be those in which the salt
concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion
concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for
short probes (e.g., 10 to 50 nucleotides) and at least about 60° C for long probes (e.g., greater
than 50 nucleotides). Stringent conditions may also be achieved with the addition of
destabilizing agents such as formamide. For selective or specific hybridization, a positive
signal is typically at least two times background, preferably 10 times background hybridization.
Exemplary stringent hybridization conditions are often: 50% formamide, 5x SSC, and 1%
SDS, incubating at 42° C, or, 5x SSC, 1% SDS, incubating at 65° C, with wash in 0.2x SSC,
and 0.1% SDS at 65° C. For PCR, a temperature of about 36° C is typical for low stringency
amplification, although annealing temperatures may vary between about 32° C and 48° C
depending on primer length. For high stringency PCR amplification, a temperature of about
62° C is typical, although high stringency annealing temperatures can range from about 50° C
to about 65° C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90° C - 95° C for
0.5 - 2 min., an annealing phase lasting 0.5 - 2 min., and an extension phase of about 72° C
for 1 - 2 min. Protocols and guidelines for low and high stringency amplification reactions
are provided, e.g., in Innis, et al. (1990) PCR Protocols, A Guide to Methods and
Applications.
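
By way of non-limiting illustration, several of the numerical guidelines recited above may be captured as in the following Python sketch. All function names are hypothetical. The sketch assumes the Tm is supplied by the user rather than computed, treats the recited "about" values as exact, and treats any probe of 50 nucleotides or fewer as a short probe.

```python
def stringent_temperature_window(tm_celsius: float) -> tuple[float, float]:
    """Return (low, high) for conditions about 5-10 degrees C below the Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def minimum_stringent_temperature(probe_length: int) -> float:
    """Minimum temperature recited for short versus long probes.

    Assumption: probes of 50 nucleotides or fewer count as short.
    """
    if probe_length < 1:
        raise ValueError("probe length must be positive")
    return 30.0 if probe_length <= 50 else 60.0


def is_positive_signal(signal: float, background: float, fold: float = 2.0) -> bool:
    """True if the signal is at least `fold` times background.

    The text recites at least 2x as typical and 10x as preferred.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    return signal >= fold * background


print(stringent_temperature_window(70.0))          # (60.0, 65.0)
print(minimum_stringent_temperature(25))           # 30.0
print(is_positive_signal(250.0, 100.0))            # True
print(is_positive_signal(250.0, 100.0, fold=10))   # False
```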

Nucleic acids that do not hybridize to each other under stringent conditions are still
substantially identical if the polypeptides which they encode are substantially identical. This
occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy
permitted by the genetic code. In such cases, the nucleic acids typically hybridize under
moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37° C, and a wash in 1X SSC at 45° C. A positive hybridization is at least twice
background. Alternative hybridization and wash conditions can be utilized to provide
conditions of similar stringency. Additional guidelines for determining hybridization
parameters are provided in numerous references, e.g., Ausubel, et al. (ed.) Current Protocols in
Molecular Biology Lippincott.

The phrase "functional effects" in the context of assays for testing compounds that
modulate activity of a lung cancer protein includes the determination of a parameter that is
indirectly or directly under the influence of the lung cancer protein or nucleic acid, e.g., a
physiological, enzymatic, functional, physical, or chemical effect, such as the ability to
decrease lung cancer. It includes ligand binding activity; cell viability, cell growth on soft
agar; anchorage dependence; contact inhibition and density limitation of growth; cellular
proliferation; cellular transformation; growth factor or serum dependence; tumor specific
marker levels; invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and
protein expression in cells undergoing metastasis, and other characteristics of lung cancer
cells. "Functional effects" include in vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that
increases or decreases a parameter that is indirectly or directly under the influence of a lung
cancer protein sequence, e.g., physiological, functional, enzymatic, physical, or chemical
effects. Such functional effects can be measured by many means known to those skilled in
the art, e.g., changes in spectroscopic characteristics (e.g., fluorescence, absorbance,
refractive index), hydrodynamic (e.g., shape), chromatographic, or solubility properties for
the protein, measuring inducible markers or transcriptional activation of the lung cancer
protein; measuring binding activity or binding assays, e.g., binding to antibodies or other
ligands, and measuring cellular proliferation. Determination of the functional effect of a
compound on lung cancer can also be performed using lung cancer assays known to those of
skill in the art such as in vitro assays, e.g., cell growth on soft agar, anchorage
dependence; contact inhibition and density limitation of growth; cellular proliferation;
cellular transformation; growth factor or serum dependence; tumor specific marker levels;
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein
expression in cells undergoing metastasis, and other characteristics of lung cancer cells. The
functional effects can be evaluated by many means known to those skilled in the art, e.g.,
microscopy for quantitative or qualitative measures of alterations in morphological features,
measurement of changes in RNA or protein levels for lung cancer-associated sequences,
measurement of RNA stability, identification of downstream or reporter gene expression
(CAT, luciferase, β-gal, GFP, and the like), e.g., via chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays.

"Inhibitors", "activators", and "modulators" of lung cancer polynucleotide and
polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules or
compounds identified using in vitro and in vivo assays of lung cancer polynucleotide and
polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block
activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate the
activity or expression of lung cancer proteins, e.g., antagonists. Antisense or inhibitory
nucleic acids may seem to inhibit expression and subsequent function of the protein.
"Activators" are compounds that increase, open, activate, facilitate, enhance activation,
sensitize, agonize, or up regulate lung cancer protein activity. Inhibitors, activators, or
modulators also include genetically modified versions of lung cancer proteins, e.g., versions
with altered activity, as well as naturally occurring and synthetic ligands, antagonists,
agonists, antibodies, small chemical molecules and the like. Such assays for inhibitors and
activators include, e.g., expressing the lung cancer protein in vitro, in cells, or cell
membranes, applying putative modulator compounds, and then determining the functional
effects on activity, as described above. Activators and inhibitors of lung cancer can also be
identified by incubating lung cancer cells with the test compound and determining increases
or decreases in the expression of 1 or more lung cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20,
25, 30, 40, 50 or more lung cancer proteins, such as lung cancer proteins encoded by the
sequences set out in Tables 1A-16.

Samples or assays comprising lung cancer proteins that are treated with a potential
activator, inhibitor, or modulator are compared to control samples without the inhibitor,
activator, or modulator to examine the extent of inhibition. Control samples (untreated with
inhibitors) are assigned a relative protein activity value of 100%. Inhibition of a polypeptide
is achieved when the activity value relative to the control is about 80%, preferably 50%, more
preferably 25-0%. Activation of a lung cancer polypeptide is achieved when the activity
value relative to the control (untreated with activators) is 110%, more preferably 150%, more
preferably 200-500% (i.e., two to five fold higher relative to the control), more preferably
1000-3000% higher.
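
By way of non-limiting illustration, the comparison to an untreated control described above may be sketched in Python as follows. The function names and the "no call" label are hypothetical. Treating "about 80%" as "80% or less" and "110%" as "110% or more" is an assumption of this sketch, as is leaving values between those bounds unclassified.

```python
def relative_activity(treated: float, control: float) -> float:
    """Activity of a treated sample as a percentage of the untreated control.

    The control is assigned a relative activity value of 100%.
    """
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * treated / control


def classify_modulation(percent_of_control: float) -> str:
    """Classify using the broadest thresholds recited in the text."""
    if percent_of_control <= 80.0:
        return "inhibition"
    if percent_of_control >= 110.0:
        return "activation"
    return "no call"


print(classify_modulation(relative_activity(40.0, 100.0)))   # inhibition
print(classify_modulation(relative_activity(3.0, 2.0)))      # activation
print(classify_modulation(relative_activity(100.0, 100.0)))  # no call
```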

The phrase "changes in cell growth" refers to any change in cell growth and
proliferation characteristics in vitro or in vivo, such as cell viability, formation of foci,
anchorage independence, semi-solid or soft agar growth, changes in contact inhibition and
density limitation of growth, loss of growth factor or serum requirements, changes in cell
morphology, gaining or losing immortalization, gaining or losing tumor specific markers,
ability to form or suppress tumors when injected into suitable animal hosts, and/or
immortalization of the cell. See, e.g., Freshney (1994) Culture of Animal Cells a Manual of
Basic Technique pp. 231-241 (3rd ed.).

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor.
"Cancer cells," "transformed" cells, or "transformation" in tissue culture, refers to
spontaneous or induced phenotypic changes that do not necessarily involve the uptake of new
genetic material. Although transformation can arise from infection with a transforming virus
and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene.
Transformation is associated with phenotypic changes, such as immortalization of cells,
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney
(1994) Culture of Animal Cells a Manual of Basic Technique (3rd ed.)).

"Antibody" refers to a polypeptide comprising a framework region from an
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen.
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta,
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG,
IgM, IgA, IgD, and IgE, respectively. Typically, the antigen-binding region of an antibody or
its functional equivalent will be most critical in specificity and affinity of binding. See Paul,
Fundamental Immunology.

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each
tetramer is composed of two identical pairs of polypeptide chains, each pair having one
"light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each
chain defines a variable region of about 100 to 110 or more amino acids primarily responsible
for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH)
refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized
fragments produced by digestion with various peptidases. Thus, e.g., pepsin digests an
antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab
which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 may be
reduced under mild conditions to break the disulfide linkage in the hinge region, thereby
converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially Fab
with part of the hinge region (see Paul (ed. 1999) Fundamental Immunology (4th ed.)). While
various antibody fragments are defined in terms of the digestion of an intact antibody, one of
skill will appreciate that such fragments may be synthesized de novo either chemically or by
using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes
antibody fragments either produced by the modification of whole antibodies, or those
synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those
identified using phage display libraries (see, e.g., McCafferty, et al. (1990) Nature
348:552-554).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal
antibodies, many techniques known in the art can be used (see, e.g., Kohler and Milstein
(1975) Nature 256:495-497; Kozbor, et al. (1983) Immunology Today 4:72; Cole, et al.
(1985), pp. 77-96 in Monoclonal Antibodies and Cancer Therapy; Coligan (1991 and
supplements) Current Protocols in Immunology; Harlow and Lane (1988) Antibodies: A
Laboratory Manual; and Goding (1986) Monoclonal Antibodies: Principles and Practice (2d
ed.)). Techniques for the production of single chain antibodies (U.S. Patent 4,946,778) can
be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or
other organisms such as other mammals, may be used to express humanized antibodies.
Alternatively, phage display technology can be used to identify antibodies and heteromeric
Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty, et al. (1990)
Nature 348:552-554; Marks, et al. (1992) Biotechnology 10:779-783).

A "chimeric antibody" is an antibody molecule in which, e.g., (a) the constant region,
or a portion thereof, is altered, replaced, or exchanged so that the antigen binding site
(variable region) is linked to a constant region of a different or altered class, effector
function, and/or species, or an entirely different molecule which confers new properties to the
chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the
variable region, or a portion thereof, is altered, replaced, or exchanged with a variable region
having a different or altered antigen specificity.



Identification of lung cancer-associated sequences 

In one aspect, the expression levels of genes are determined in different patient
samples for which diagnosis information is desired, to provide expression profiles. An
expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is
characteristic of the state of the cell. That is, normal tissue may be distinguished from
cancerous or metastatic cancerous tissue, or metastatic cancerous tissue can be compared
with tissue from surviving cancer patients. By comparing expression profiles of tissue in
known different lung cancer states, information regarding which genes are important
(including both up- and down-regulation of genes) in each of these states is obtained.
Molecular profiling may distinguish subtypes of a currently collective disease designation,
e.g., different forms of lung cancer (chronic disease, adenocarcinoma, etc.).

The identification of sequences that are differentially expressed in lung cancer versus
non-lung cancer tissue allows the use of this information in a number of ways. For example,
a particular treatment regime may be evaluated: does a chemotherapeutic drug act to
down-regulate lung cancer, and thus tumor growth or recurrence, in a particular patient?
Alternatively, a treatment step may induce other markers which may be used as targets to
destroy tumor cells. Similarly, diagnosis and treatment outcomes may be done or confirmed
by comparing patient samples with the known expression profiles. Malignant disease may be
compared to non-malignant conditions. Metastatic tissue can also be analyzed to determine
the stage of lung cancer in the tissue, or origin of primary tumor, e.g., metastasis from a
remote primary site. Furthermore, these gene expression profiles (or individual genes) allow
screening of drug candidates with an eye to mimicking or altering a particular expression
profile; e.g., screening can be done for drugs that suppress the lung cancer expression profile.
This may be done by making biochips comprising sets of the important lung cancer genes,
which can then be used in these screens. PCR methods may be applied with selected primer
pairs, and analysis may be of RNA or of genomic sequences. These methods can also be
done on the protein basis; that is, protein expression levels of the lung cancer proteins can be
evaluated for diagnostic purposes or to screen candidate agents. In addition, the lung cancer
nucleic acid sequences can be administered for gene therapy purposes, including the
administration of antisense nucleic acids, or the lung cancer proteins (including antibodies
and other modulators thereof) administered as therapeutic drugs or as protein or DNA
vaccines.
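
By way of illustration only, the comparison of a patient sample with known expression
profiles might be sketched as follows. The sketch assumes that each profile is a mapping from
gene identifier to expression level and that similarity is scored by Pearson correlation; the
function names, gene identifiers, and values are hypothetical and are not taken from the
assays described herein.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def closest_state(sample, references):
    """Return the name of the reference profile most correlated with the sample.

    sample: dict mapping gene id -> expression level
    references: dict mapping state name -> dict of gene id -> expression level
    All profiles are assumed to cover the same genes.
    """
    genes = sorted(sample)
    sample_values = [sample[g] for g in genes]
    return max(
        references,
        key=lambda state: pearson(sample_values, [references[state][g] for g in genes]),
    )


# Hypothetical reference profiles and patient sample.
references = {
    "normal": {"g1": 1.0, "g2": 5.0, "g3": 2.0},
    "tumor": {"g1": 6.0, "g2": 1.0, "g3": 4.0},
}
patient = {"g1": 5.5, "g2": 1.2, "g3": 3.0}
state = closest_state(patient, references)
```

Correlation is only one possible similarity measure; any measure that evaluates a number of
genes simultaneously would serve the same illustrative purpose.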

Thus the present invention provides nucleic acid and protein sequences that are
differentially expressed in lung cancer relative to normal tissues and/or non-malignant lung
disease, or in different types of lung disease, herein termed "lung cancer sequences." As
outlined below, lung cancer sequences include those that are up-regulated (i.e., expressed at a
higher level) in lung cancer, as well as those that are down-regulated (i.e., expressed at a
lower level). In a preferred embodiment, the lung cancer sequences are from humans;
however, as will be appreciated by those in the art, lung cancer sequences from other
organisms may be useful in animal models of disease and drug evaluation; thus, other lung
cancer sequences are provided, from vertebrates, including mammals, including rodents (rats,
mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, goats, pigs, cows,
horses, etc.) and pets (dogs, cats, etc.). Lung cancer sequences from other organisms may be
obtained using the techniques outlined below.

Lung cancer sequences can include both nucleic acid and amino acid sequences. As
will be appreciated by those in the art and is more fully outlined below, lung cancer nucleic
acid sequences are useful in a variety of applications, including diagnostic applications,
which will detect naturally occurring nucleic acids, as well as screening applications; e.g.,
biochips comprising nucleic acid probes or PCR microtiter plates with selected probes to the
lung cancer sequences can be generated.

A lung cancer sequence can be initially identified by substantial nucleic acid and/or
amino acid sequence homology to the lung cancer sequences outlined herein. Such
homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, e.g., using homology programs or hybridization
conditions.

For identifying lung cancer-associated sequences, the lung cancer screen typically
includes comparing genes identified in different tissues, e.g., normal and cancerous tissues,
cancer and non-malignant conditions, non-malignant conditions and normal tissues, or tumor
tissue samples from patients who have metastatic disease vs. non-metastatic tissue. Other
suitable tissue comparisons include comparing lung cancer samples with metastatic cancer
samples from other cancers, such as breast, other gastrointestinal cancers, prostate, ovarian,
etc. Samples of non-metastatic disease tissue and tissue undergoing metastasis are applied to
biochips comprising nucleic acid probes. The samples are first microdissected, if applicable,
and treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, e.g., from Affymetrix, Santa Clara, CA. Gene expression profiles as
described herein are generated and the data analyzed.

In one embodiment, the genes showing changes in expression as between normal and
disease states are compared to genes expressed in other normal tissues, preferably normal
lung, but also including, and not limited to, colon, heart, brain, liver, breast, kidney, muscle,
prostate, small intestine, large intestine, spleen, bone, and/or placenta. In a preferred
embodiment, those genes identified during the lung cancer screen that are expressed in
significant amounts in other tissues (e.g., essential organs) are removed from the profile,
although in some embodiments, this is not necessary (e.g., where organs may be dispensable
at a later stage of life). That is, when screening for drugs, it is usually preferable that the
target expression be disease specific, to minimize possible side effects on other organs.
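
As a non-limiting sketch, the removal of genes that are expressed in significant amounts in
other normal tissues could be expressed as a simple filter. The cutoff value, the data layout,
and all names below are assumptions made for illustration; the specification does not define
what level counts as "significant."

```python
def filter_disease_specific(candidates, normal_expression, max_normal_level=100.0):
    """Keep candidate genes whose level stays below a cutoff in every normal tissue.

    candidates: list of gene ids identified in the lung cancer screen
    normal_expression: dict mapping gene id -> {tissue name: expression level}
    max_normal_level: hypothetical cutoff for "significant" expression
    Genes with no normal-tissue data are retained, since nothing argues for removal.
    """
    kept = []
    for gene in candidates:
        levels = normal_expression.get(gene, {})
        if all(level < max_normal_level for level in levels.values()):
            kept.append(gene)
    return kept


# Hypothetical screen results and normal-tissue measurements.
candidates = ["geneA", "geneB", "geneC"]
normal_expression = {
    "geneA": {"heart": 10.0, "liver": 20.0},
    "geneB": {"heart": 500.0, "liver": 5.0},  # high in heart, so removed
}
specific = filter_disease_specific(candidates, normal_expression)
```

In embodiments where expression in a dispensable organ is acceptable, the corresponding
tissue could simply be omitted from the measurements passed to the filter.
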
In a preferred embodiment, lung cancer sequences are those that are up-regulated in
lung cancer; that is, the expression of these genes is higher in cancerous tissue than in normal
lung or other tissue. "Up-regulation" as used herein means, when the ratio is presented as a
number greater than one, that the ratio is greater than one, preferably 1.5 or greater, more
preferably 2.0 or greater. Another embodiment is directed to sequences up-regulated in
non-malignant conditions relative to normal. Unigene cluster identification numbers and
accession numbers herein are for the GenBank sequence database and the sequences of the
accession numbers are hereby expressly incorporated by reference. GenBank is known in the
art, see, e.g., Benson, D.A., et al. (1998) Nucleic Acids Research 26:1-7 and
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g.,
European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ).
In some situations, the sequences may be derived from assembly of
available sequences or be predicted from genomic DNA using exon prediction algorithms,
such as FGENESH (Salamov and Solovyev (2000) Genome Res. 10:516-522). In other
situations, sequences have been derived from cloning and sequencing of isolated nucleic
acids.

In another preferred embodiment, lung cancer sequences are those that are
down-regulated in the lung cancer; that is, the expression of these genes is lower in cancerous
tissue than in normal lung or other tissue. "Down-regulation" as used herein means, when the
ratio is presented as a number greater than one, that the ratio is greater than one, preferably
1.5 or greater, more preferably 2.0 or greater, or, when the ratio is presented as a number less
than one, that the ratio is less than one, preferably 0.5 or less, more preferably 0.25 or less.
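
The ratio conventions above can be illustrated with a minimal sketch. It assumes the ratio is
taken as the cancerous-tissue level divided by the normal-tissue level, and it uses the
"preferably" thresholds (1.5 and 0.5) as default cutoffs; the function name and parameters are
hypothetical.

```python
def classify_regulation(tumor_level, normal_level, up_cutoff=1.5, down_cutoff=0.5):
    """Classify a gene from its tumor/normal expression ratio.

    up_cutoff and down_cutoff default to the "preferably" values in the text;
    pass 2.0 and 0.25 for the "more preferably" values.
    """
    if normal_level <= 0 or tumor_level < 0:
        raise ValueError("expression levels must be non-negative, normal level positive")
    ratio = tumor_level / normal_level
    if ratio >= up_cutoff:
        return "up-regulated"
    if ratio <= down_cutoff:
        return "down-regulated"
    return "unchanged"


# Hypothetical expression levels in arbitrary units.
high = classify_regulation(300.0, 100.0)   # ratio 3.0
low = classify_regulation(25.0, 100.0)     # ratio 0.25
flat = classify_regulation(110.0, 100.0)   # ratio 1.1
```

A down-regulation ratio presented as a number greater than one corresponds to the reciprocal
convention (normal level divided by tumor level), so a cutoff of 0.5 here is equivalent to a
cutoff of 2.0 under that convention.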

Informatics 

The ability to identify genes that are over or under expressed in lung cancer can
additionally provide high-resolution, high-sensitivity datasets which can be used in the areas
of diagnostics, therapeutics, drug development, pharmacogenetics, protein structure,
biosensor development, and other related areas. For example, the expression profiles can be
used in diagnostic or prognostic evaluation of patients with lung cancer. Or as another
example, subcellular toxicological information can be generated to better direct drug structure
and activity correlation (see Anderson (1998) Pharmaceutical Proteomics: Targets,
Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological
sensor device to predict the likely toxicological effect of chemical exposures and likely
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that includes 
at least one set of assay data. The data contained in the database is acquired, e.g., using array 
analysis either singly or in a library format. The database can be in a form in which data can
be maintained and transmitted, but is preferably an electronic database. The electronic 
database of the invention can be maintained on any electronic device allowing for the storage 
of and access to the database, such as a personal computer, but is preferably distributed on a 
wide area network, such as the World Wide Web. 

The focus of the present section on databases that include peptide sequence data is for 
clarity of illustration only. It will be apparent to those of skill in the art that similar databases
can be assembled for assay data acquired using an assay of the invention. 

The compositions and methods for identifying and/or quantitating the relative and/or
absolute abundance of a variety of molecular and macromolecular species from a biological
sample representing lung cancer, i.e., the identification of lung cancer-associated sequences
described herein, provide an abundance of information, which can be correlated with
pathological conditions, predisposition to disease, drug testing, therapeutic monitoring,
gene-disease causal linkages, identification of correlates of immunity and physiological status,
among others. Although the data generated from the assays of the invention is suited for
manual review and analysis, in a preferred embodiment, data processing using high-speed
computers is utilized.

An array of methods for indexing and retrieving biomolecular information is known
in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational database
system for storing biomolecular sequence information in a manner that allows sequences to
be catalogued and searched according to one or more protein function hierarchies. U.S.
Patent 5,953,727 discloses a relational database having sequence records containing
information in a format that allows a collection of partial-length DNA sequences to be
catalogued and searched according to association with one or more sequencing projects for
obtaining full-length sequences from the collection of partial length sequences. U.S. Patent
5,706,498 discloses a gene database retrieval system for making a retrieval of a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally-derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a
multi-dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

See also Mount, et al. (2001) Bioinformatics; Durbin, et al. (eds., 1999) Biological
Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids; Baxevanis and
Ouellette (eds., 1998) Bioinformatics: A Practical Guide to the Analysis of Genes and
Proteins; Rashidi and Buehler (1999) Bioinformatics: Basic Applications in Biological
Science and Medicine; Setubal, et al. (eds., 1997) Introduction to Computational Molecular
Biology; Misener and Krawetz (eds., 2000) Bioinformatics: Methods and Protocols; Higgins
and Taylor (eds., 2000) Bioinformatics: Sequence, Structure, and Databanks: A Practical
Approach; Brown (2001) Bioinformatics: A Biologist's Guide to Biocomputing and the
Internet; Han and Kamber (2000) Data Mining: Concepts and Techniques; and
Waterman (1995) Introduction to Computational Biology: Maps, Sequences, and Genomes.

The present invention provides a computer database comprising a computer and
software for storing in computer-retrievable form assay data records cross-tabulated, e.g.,
with data specifying the source of the target-containing sample from which each sequence
specificity record was obtained.

In an exemplary embodiment, at least one of the sources of target-containing sample
is from a control tissue sample known to be free of pathological disorders. In a variation, at
least one of the sources is a known pathological tissue specimen, e.g., a neoplastic lesion or
another tissue specimen to be analyzed for lung cancer. In another variation, the assay
records cross-tabulate one or more of the following parameters for each target species in a
sample: (1) a unique identification code, which can include, e.g., a target molecular structure
and/or characteristic separation coordinate (e.g., electrophoretic coordinates); (2) sample
source; and (3) absolute and/or relative quantity of the target species present in the sample.
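
One possible in-memory representation of such cross-tabulated assay records is sketched
below. The three fields mirror parameters (1) to (3) above; the class name, field names, and
example values are hypothetical, and no particular database product or schema is implied.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    target_id: str       # (1) unique identification code for the target species
    sample_source: str   # (2) source of the target-containing sample
    quantity: float      # (3) absolute or relative quantity in the sample


def cross_tabulate(records):
    """Build a table mapping target id -> {sample source: quantity}."""
    table = defaultdict(dict)
    for record in records:
        table[record.target_id][record.sample_source] = record.quantity
    return dict(table)


# Hypothetical records from a control sample and a pathological specimen.
records = [
    AssayRecord("T001", "control", 1.0),
    AssayRecord("T001", "lesion", 4.2),
    AssayRecord("T002", "control", 3.0),
]
table = cross_tabulate(records)
```

Keying the table first by target and then by source makes it straightforward to compare the
quantity of a single target species across control and pathological samples.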

The invention also provides for the storage and retrieval of a collection of target data
in a computer data storage apparatus, which can include magnetic disks, optical disks,
magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, magnetic
bubble memory devices, and other data storage devices, including CPU registers and on-CPU
data storage arrays. Typically, the target data records are stored as a bit pattern in an array of
magnetic domains on a magnetizable medium or as an array of charge states or transistor gate
states, such as an array of cells in a DRAM device (e.g., each cell comprised of a transistor
and a charge storage area, which may be on the transistor). In one embodiment, the invention
provides such storage devices, and computer systems built therewith, comprising a bit pattern
encoding a protein expression fingerprint record comprising unique identifiers for at least 10
target data records cross-tabulated with target source.

When the target is a peptide or nucleic acid, the invention preferably provides a
method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen.

The invention also preferably provides a magnetic disk, such as an IBM-compatible
(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format (e.g., Linux,
SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or hard (fixed,
Winchester) disk drive, comprising a bit pattern encoding data from an assay of the invention
in a file format suitable for retrieval and processing in a computerized sequence analysis,
comparison, or relative quantitation method.

The invention also provides a network, comprising a plurality of computing devices
linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, ISDN
line, wireless network, optical fiber, or other suitable signal transmission medium, whereby at
least one network device (e.g., computer, disk array, etc.) comprises a pattern of magnetic
domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM cells)
comprising a bit pattern encoding data acquired from an assay of the invention.

The invention also provides a method for transmitting assay data that includes
generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal
includes (in native or encrypted format) a bit pattern encoding data from an assay or a
database comprising a plurality of assay results obtained by the method of the invention.

In a preferred embodiment, the invention provides a computer system for comparing a
query target to a database containing an array of data structures, such as an assay result
obtained by the method of the invention, and ranking database targets based on the degree of
identity and gap weight to the target data. A central processor is preferably initialized to load
and execute the computer program for alignment and/or comparison of the assay results.
Data for a query target is entered into the central processor via an I/O device. Execution of
the computer program results in the central processor retrieving the assay data from the data
file, which comprises a binary description of an assay result.

The target data or record and the computer program can be transferred to secondary
memory, which is typically random access memory (e.g., DRAM, SRAM, SGRAM, or
SDRAM). Targets are ranked according to the degree of correspondence between a selected
assay characteristic (e.g., binding to a selected affinity moiety) and the same characteristic of
the query target and results are output via an I/O device. For example, a central processor
can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, PA-8000, SPARC,
MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or public domain
molecular biology software package (e.g., UWGCG Sequence Analysis Software, Darwin); a
data file can be an optical or magnetic disk, a data server, a memory device (e.g., DRAM,
SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, etc.); an I/O device can
be a terminal comprising a video display and a keyboard, a modem, an ISDN terminal
adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or other suitable I/O
device.
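
For purposes of illustration, ranking database targets against a query by degree of identity
and gap weight might be sketched with a basic global alignment score. This is a simplified
stand-in for the packages named above, not a reproduction of any of them; the scoring values
(match +1, mismatch -1, gap -2), function names, and sequences are assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of a and b (linear gap penalty)."""
    # previous[j] holds the best score for aligning the processed prefix of a with b[:j]
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        current = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            current.append(max(diagonal, previous[j] + gap, current[j - 1] + gap))
        previous = current
    return previous[-1]


def rank_targets(query, database):
    """Return (name, score) pairs for database entries, best match first."""
    scored = [(name, global_alignment_score(query, seq)) for name, seq in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical query and database of stored sequence records.
database = {"same": "ACGT", "one_off": "ACGA", "short": "AC"}
ranking = rank_targets("ACGT", database)
```

Here the identical record scores highest, the record with a single mismatch ranks second, and
the truncated record is penalized by the gap weight and ranks last.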

The invention also preferably provides the use of a computer system, such as that
described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4)
a program for alignment and comparison, typically with rank-ordering of comparison results
on the basis of computed similarity values.

Characteristics of lung cancer-associated proteins

Lung cancer proteins of the present invention may be classified as secreted proteins,
transmembrane proteins or intracellular proteins. In one embodiment, the lung cancer protein
is an intracellular protein. Intracellular proteins may be found in the cytoplasm and/or in the
nucleus. Intracellular proteins are involved in all aspects of cellular function and replication
(including, e.g., signaling pathways); aberrant expression of such proteins often results in
unregulated or disregulated cellular processes (see, e.g., Alberts (ed., 1994) Molecular
Biology of the Cell (3d ed.)). For example, many intracellular proteins have enzymatic
activity such as protein kinase activity, protein phosphatase activity, protease activity,
nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins also serve
as docking proteins that are involved in organizing complexes of proteins, or targeting
proteins to various subcellular localizations, and are involved in maintaining the structural
integrity of organelles.

An increasingly appreciated concept in characterizing proteins is the presence in the
proteins of one or more structural motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of amino acid
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and/or molecules with which the protein may associate.
One useful database is Pfam (protein families), which is a large collection of multiple
sequence alignments and hidden Markov models covering many common protein domains.
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman, et al. (2000)
Nuc. Acids Res. 28:263-266; Sonnhammer, et al. (1997) Proteins 28:405-420; Bateman, et al.
(1999) Nuc. Acids Res. 27:260-262; and Sonnhammer, et al. (1998) Nuc. Acids Res. 26:320-
322).

In another embodiment, the lung cancer sequences are transmembrane proteins.
Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. They may
have an intracellular domain, an extracellular domain, or both. The intracellular domains of
such proteins may have a number of functions including those already described for
intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain
containing proteins.

Transmembrane proteins may contain from one to many transmembrane domains.
For example, receptor tyrosine kinases, certain cytokine receptors, receptor guanylyl cyclases
and receptor serine/threonine protein kinases contain a single transmembrane domain.
However, various other proteins including channels, pumps, and adenylyl cyclases contain
numerous transmembrane domains. Many important cell surface receptors such as G protein
coupled receptors (GPCRs) are classified as "seven transmembrane domain" proteins, as they
contain 7 membrane spanning regions. Characteristics of transmembrane domains include
approximately 17 consecutive hydrophobic amino acids that may be followed by charged
amino acids. Therefore, upon analysis of the amino acid sequence of a particular protein, the
localization and number of transmembrane domains within the protein may be predicted (see,
e.g., PSORT web site http://psort.nibb.ac.jp/).
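
The hydrophobic-stretch characteristic described above lends itself to a simple scan, sketched
here under stated assumptions. The set of residues treated as hydrophobic and the minimum
run length of 17 are illustrative choices; this is far cruder than PSORT and is not a
description of how that program works.

```python
# Assumed set of hydrophobic residues (one-letter codes); other groupings are possible.
HYDROPHOBIC = set("AVLIMFWC")


def find_hydrophobic_stretches(sequence, min_length=17):
    """Return (start, end) index pairs of hydrophobic runs at least min_length long.

    Indices are zero-based and end is exclusive, so sequence[start:end] is the run.
    """
    stretches = []
    run_start = None
    for index, residue in enumerate(sequence.upper()):
        if residue in HYDROPHOBIC:
            if run_start is None:
                run_start = index
        else:
            if run_start is not None and index - run_start >= min_length:
                stretches.append((run_start, index))
            run_start = None
    # A run may extend to the very end of the sequence.
    if run_start is not None and len(sequence) - run_start >= min_length:
        stretches.append((run_start, len(sequence)))
    return stretches


# Hypothetical sequence: a 20-residue leucine run flanked by charged residues.
protein = "MKT" + "L" * 20 + "RRK" + "AVL" + "DE"
candidates = find_hydrophobic_stretches(protein)
```

The number of stretches returned gives a rough count of candidate transmembrane domains,
and their coordinates give a rough localization within the protein.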

The extracellular domains of transmembrane proteins are diverse; however, conserved
motifs are found repeatedly among various extracellular domains. Conserved structure
and/or functions have been ascribed to different extracellular motifs. Many extracellular
domains are involved in binding to other molecules. In one aspect, extracellular domains are
found on receptors. Factors that bind the receptor domain include circulating ligands, which
may be peptides, proteins, or small molecules such as adenosine and the like. For example,
growth factors such as EGF, FGF, and PDGF are circulating growth factors that bind to their
cognate receptors to initiate a variety of cellular responses. Other factors include cytokines,
mitogenic factors, hormones, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they may mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol
(GPI) anchor, or may themselves be transmembrane proteins. Extracellular domains may
also associate with the extracellular matrix and contribute to the maintenance of the cell
structure.

Lung cancer proteins that are transmembrane are particularly preferred in the present
invention as they are readily accessible targets for extracellular immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can be also useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ or in histological analysis. Alternatively, antibodies can also label intracellular proteins,
in which case analytical samples are typically permeabilized to provide access to intracellular
proteins. In addition, some membrane proteins can be processed to release a soluble protein,
or to expose a residual fragment. Released soluble proteins may be useful diagnostic
markers; processed residual protein fragments may be useful lung markers of disease.

It will also be appreciated by those in the art that a transmembrane protein can be 
made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the lung cancer proteins are secreted proteins; the secretion of
which can be either constitutive or regulated. These proteins may have a signal peptide or
signal sequence that targets the molecule to the secretory pathway. Secreted proteins are
involved in numerous physiological events; e.g., if circulating, they often serve to transmit
signals to various other cell types. The secreted protein may function in an autocrine manner
(acting on the cell that secreted the factor), a paracrine manner (acting on cells in close
proximity to the cell that secreted the factor), an endocrine manner (acting on cells at a
distance, e.g., secretion into the blood stream), or exocrine (secretion, e.g., through a duct or
to adjacent epithelial surface as sweat glands, sebaceous glands, pancreatic ducts, lacrimal
glands, mammary glands, wax producing glands of the ear, etc.). Thus secreted molecules
often find use in modulating or altering numerous aspects of physiology. Lung cancer
proteins that are secreted proteins are particularly preferred in the present invention as they
serve as good targets for diagnostic markers, e.g., for blood, plasma, serum, or stool tests.
Those which are enzymes may be antibody or small molecule targets. Others may be useful
as vaccine targets, e.g., via CTL mechanisms.

Use of lung cancer nucleic acids 

As described above, a lung cancer sequence is initially identified by substantial nucleic
acid and/or amino acid sequence homology or linkage to the lung cancer sequences outlined
herein. Such homology can be based upon the overall nucleic acid or amino acid sequence,
and is generally determined as outlined below, using either homology programs or
hybridization conditions. Typically, linked sequences on an mRNA are found on the same
molecule.

The lung cancer nucleic acid sequences of the invention, e.g., the sequences in Tables
1A-16, can be fragments of larger genes, i.e., they are nucleic acid segments. "Genes" in this
context includes coding regions, non-coding regions, and mixtures of coding and non-coding
regions. Accordingly, as will be appreciated by those in the art, using the sequences provided
herein, extended sequences, in either direction, of the lung cancer genes can be obtained,
using techniques well known in the art for cloning either longer sequences or the full length
sequences; see Ausubel, et al., supra. Much can be done by informatics and many sequences
can be clustered to include multiple sequences corresponding to a single gene, e.g., systems
such as UniGene (see, http://www.ncbi.nlm.nih.gov/UniGene/).

Once a lung cancer nucleic acid is identified, it can be cloned and, if necessary, its
constituent parts recombined to form the entire lung cancer nucleic acid coding regions or the
entire mRNA sequence. Once isolated from its natural source, e.g., contained within a
plasmid or other vector or excised therefrom as a linear nucleic acid segment, the
recombinant lung cancer nucleic acid can be further used as a probe to identify and isolate
other lung cancer nucleic acids, e.g., extended coding regions. It can also be used as a
"precursor" nucleic acid to make modified or variant lung cancer nucleic acids and proteins.

The lung cancer nucleic acids of the present invention are used in several ways. In a
first embodiment, nucleic acid probes to the lung cancer nucleic acids are made and attached
to biochips to be used in screening and diagnostic methods, as outlined below, or for
administration, e.g., for gene therapy, RNAi, vaccine, and/or antisense applications.
Alternatively, the lung cancer nucleic acids that include coding regions of lung cancer 
proteins can be put into expression vectors for the expression of lung cancer proteins, again 
for screening purposes or for administration to a patient. 

In a preferred embodiment, nucleic acid probes to lung cancer nucleic acids (both the
nucleic acid sequences outlined in the figures and/or the complements thereof) are made.
The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the lung cancer nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that
hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target
sequences to hybridize under appropriate reaction conditions, particularly high stringency
conditions, as outlined herein.

A nucleic acid probe is generally single stranded but can be partially single and
partially double stranded. The strandedness of the probe is dictated by the structure,
composition, and properties of the target sequence. In general, the nucleic acid probes range
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally
complements of ORFs or whole genes are not used. In some embodiments, nucleic acids of
lengths up to hundreds of bases can be used.

In a preferred embodiment, more than one probe per sequence is used, with either
overlapping probes or probes to different sections of the target being used. That is, two,
three, four or more probes, with three being preferred, are used to build in a redundancy for a
particular target. The probes can be overlapping (i.e., have some sequence in common), or
separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
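
The use of several probes per target, either overlapping or separate, can be illustrated with a
minimal sketch. It assumes a DNA target supplied 5' to 3' as a plain string. The helper names
(reverse_complement, design_probes) and the step parameter are hypothetical and are not part of
the disclosure: a step smaller than the probe length gives overlapping probes, and a step equal
to or larger than the probe length gives separate probes.

```python
# Illustrative sketch only: tile complementary probes along a target sequence.
# All names and defaults here are assumptions made for the example.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_probes(target: str, probe_len: int = 30, n_probes: int = 3, step=None):
    """Return up to n_probes (start, probe) pairs complementary to the target.

    step is the distance between successive probe start positions; it defaults
    to probe_len, which yields adjacent, non-overlapping probes.
    """
    target = target.upper()
    step = probe_len if step is None else step
    probes = []
    for i in range(n_probes):
        start = i * step
        window = target[start:start + probe_len]
        if len(window) < probe_len:
            break  # ran off the end of the target; stop early
        probes.append((start, reverse_complement(window)))
    return probes


if __name__ == "__main__":
    example_target = "ACGT" * 30  # placeholder 120-base target
    for start, probe in design_probes(example_target, probe_len=30, n_probes=3, step=15):
        print(start, probe)
```

Three probes of 30 bases are used as defaults only because those values fall within the preferred
ranges stated above; any other values in the disclosed ranges could be substituted.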

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
removal as outlined below. The binding can typically be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is typically meant one or more of
electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is
the covalent attachment of a molecule, such as streptavidin, to the support and the
non-covalent binding of the biotinylated probe to the streptavidin. By "covalent binding" and
grammatical equivalents herein is meant that the two moieties, the solid support and the
probe, are attached by at least one bond, including sigma bonds, pi bonds and coordination
bonds. Covalent bonds can be formed directly between the probe and the solid support or can
be formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to a biochip in a wide variety of ways, as will be 
appreciated by those in the art. As described herein, the nucleic acids can either be
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on
the biochip.

The biochip comprises a suitable solid substrate. By "substrate" or "solid support" or
other grammatical equivalents herein is meant a material that can be modified for the
attachment or association of the nucleic acid probes and is amenable to at least one detection
method. Often the substrate may contain discrete individual sites appropriate for individual
partitioning and identification. As will be appreciated by those in the art, the number of
possible substrates is very large, and includes, but is not limited to, glass and modified or
functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and
other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.),
polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including
silicon and modified silicon, carbon, metals, inorganic glasses, plastics, etc. In general, the
substrates allow optical detection and do not appreciably fluoresce. A preferred substrate is
described in US application entitled Reusable Low Fluorescent Plastic Biochip, U.S.
Application Serial No. 09/270,214, filed March 15, 1999, herein incorporated by reference in
its entirety.

Generally the substrate is planar, although as will be appreciated by those in the art,
other configurations of substrates may be used as well. For example, the probes may be
placed on the inside surface of a tube, for flow-through sample analysis to minimize sample
volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed
cell foams made of particular plastics.

In a preferred embodiment, the surface of the biochip and the probe may be
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g.,
the biochip is derivatized with a chemical functional group including, but not limited to,
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being
particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g.,
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases,
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be
used.

In this embodiment, oligonucleotides are synthesized, and then attached to the surface
of the solid support. Either the 5' or 3' terminus may be attached to the solid support, or
attachment may be via linkage to an internal nucleoside.

In another embodiment, the immobilization to the solid support may be very strong,
yet non-covalent. For example, biotinylated oligonucleotides can be made, which bind to
surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is known in
the art. For example, photoactivation techniques utilizing photopolymerization compounds
and techniques are used. In a preferred embodiment, the nucleic acids can be synthesized in
situ, using known photolithographic techniques, such as those described in WO 95/25116;
WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references cited within, all of
which are expressly incorporated by reference; these methods of attachment form the basis of
the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression level of
lung cancer-associated sequences. These assays are typically performed in conjunction with
reverse transcription. In such assays, a lung cancer-associated nucleic acid sequence acts as a
template in an amplification reaction (e.g., Polymerase Chain Reaction, or PCR). In a
quantitative amplification, the amount of amplification product will be proportional to the
amount of template in the original sample. Comparison to appropriate controls provides a
measure of the amount of lung cancer-associated RNA. Methods of quantitative
amplification are well known to those of skill in the art. Detailed protocols for quantitative
PCR are provided, e.g., in Innis, et al. (1990) PCR Protocols, A Guide to Methods and
Applications.
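
As one illustration of comparison to controls, the following sketch applies the widely used
comparative threshold-cycle (delta-delta Ct) calculation. That calculation is not named in the
text above and is shown only as an assumed example. The function name, the argument names, and
the default of perfect doubling per cycle (efficiency = 2.0) are hypothetical choices.

```python
# Illustrative sketch only: relative quantification from threshold cycles (Ct).
# The delta-delta Ct approach and every name below are assumptions for this example.


def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
    efficiency: float = 2.0,
) -> float:
    """Estimate the fold change of a target transcript in a sample versus a control.

    Each Ct is first normalized to a reference transcript measured in the same
    specimen, and the normalized sample value is then compared to the control.
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    delta_delta = delta_sample - delta_control
    # A lower Ct means more starting template, hence the negative exponent.
    return efficiency ** (-delta_delta)


if __name__ == "__main__":
    # Placeholder Ct values, not measured data.
    print(relative_expression(22.0, 18.0, 25.0, 18.0))
```

Because the result depends strongly on the assumed amplification efficiency, a measured
efficiency would ordinarily be substituted for the default.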

In some embodiments, a TaqMan based assay is used to measure expression.
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com).

Other suitable amplification methods include, but are not limited to, ligase chain
reaction (LCR) (see Wu and Wallace (1989) Genomics 4:560, Landegren, et al. (1988)
Science 241:1077, and Barringer, et al. (1990) Gene 89:117), transcription amplification
(Kwoh, et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173), self-sustained sequence
replication (Guatelli, et al. (1990) Proc. Nat. Acad. Sci. USA 87:1874), dot PCR, and linker
adapter PCR, etc.

Expression of lung cancer proteins from nucleic acids 

In a preferred embodiment, lung cancer nucleic acids, e.g., encoding lung cancer
proteins, are used to make a variety of expression vectors to express lung cancer proteins
which can then be used in screening assays, as described below. Expression vectors and
recombinant DNA technology are well known to those of skill in the art (see, e.g., Ausubel,
supra, and Fernandez and Hoeffler (eds. 1999) Gene Expression Systems) and are used to
express proteins. The expression vectors may be either self-replicating extrachromosomal
vectors or vectors which integrate into a host genome. Generally, these expression vectors
include transcriptional and translational regulatory nucleic acid operably linked to the nucleic
acid encoding the lung cancer protein. The term "control sequences" refers to DNA
sequences used for the expression of an operably linked coding sequence in a particular host
organism. Control sequences that are suitable for prokaryotes, e.g., include a promoter,
optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are known to
utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional relationship with
another nucleic acid sequence. For example, DNA for a presequence or secretory leader is
operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in
the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding
sequence if it affects the transcription of the sequence; or a ribosome binding site is operably
linked to a coding sequence if it is positioned so as to facilitate translation. Generally,
"operably linked" means that the DNA sequences being linked are contiguous, and, in the
case of a secretory leader, contiguous and in reading phase. However, enhancers do not have
to be contiguous. Linking is typically accomplished by ligation at convenient restriction
sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in
accordance with conventional practice. Transcriptional and translational regulatory nucleic
acid will generally be appropriate to the host cell used to express the lung cancer protein.
Numerous types of appropriate expression vectors, and suitable regulatory sequences are
known in the art for a variety of host cells.

In general, transcriptional and translational regulatory sequences may include, but are
not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop
sequences, translational start and stop sequences, and enhancer or activator sequences. In a
preferred embodiment, the regulatory sequences include a promoter and transcriptional start 
and stop sequences. 

Promoter sequences may be either constitutive or inducible promoters. The promoters
may be either naturally occurring promoters or hybrid promoters. Hybrid promoters, which
combine elements of more than one promoter, are also known in the art, and are useful in the 
present invention. 

In addition, an expression vector may comprise additional elements. For example, the
expression vector may have two replication systems, thus allowing it to be maintained in two
organisms, e.g., in mammalian or insect cells for expression and in a prokaryotic host for
cloning and amplification. Furthermore, for integrating expression vectors, the expression
vector often contains at least one sequence homologous to the host cell genome, and
preferably two homologous sequences which flank the expression construct. The integrating
vector may be directed to a specific locus in the host cell by selecting the appropriate
homologous sequence for inclusion in the vector. Constructs for integrating vectors are well
known in the art (e.g., Fernandez and Hoeffler, supra).

In addition, in a preferred embodiment, the expression vector contains a selectable
marker gene to allow the selection of transformed host cells. Selection genes are well known
in the art and will vary with the host cell used.

The lung cancer proteins of the present invention are usually produced by culturing a
host cell transformed with an expression vector containing nucleic acid encoding a lung
cancer protein, under the appropriate conditions to induce or cause expression of the lung
cancer protein. Conditions appropriate for lung cancer protein expression will vary with the
choice of the expression vector and the host cell, and will be easily ascertained by one skilled
in the art through routine experimentation or optimization. For example, the use of
constitutive promoters in the expression vector will require optimizing the growth and
proliferation of the host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic
viruses, and thus harvest time selection can be crucial for product yield.

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect and
animal cells, including mammalian cells. Of particular interest are Saccharomyces cerevisiae
and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, Neurospora, BHK,
CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial cells), THP1 cells (a
macrophage cell line) and various other human cells and cell lines.

In a preferred embodiment, the lung cancer proteins are expressed in mammalian
cells. Mammalian expression systems are also known in the art, and include retroviral and
adenoviral systems. Of particular use as mammalian promoters are the promoters from
mammalian viral genes, since the viral genes are often highly expressed and have a broad
host range. Examples include the SV40 early promoter, mouse mammary tumor virus LTR
promoter, adenovirus major late promoter, herpes simplex virus promoter, and the CMV
promoter (see, e.g., Fernandez and Hoeffler, supra). Typically, transcription termination and
polyadenylation sequences recognized by mammalian cells are regulatory regions located 3'
to the translation stop codon and thus, together with the promoter elements, flank the coding
sequence. Examples of transcription terminator and polyadenylation signals include those
derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as well as
other hosts, are well known in the art, and will vary with the host cell used. Techniques
include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated
transfection, protoplast fusion, electroporation, viral infection, encapsulation of the
polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.

In a preferred embodiment, lung cancer proteins are expressed in bacterial systems.
Promoters from bacteriophage may also be used and are known in the art. In addition,
synthetic promoters and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of
the trp and lac promoter sequences. Furthermore, a bacterial promoter can include naturally
occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA
polymerase and initiate transcription. In addition to a functioning promoter sequence, an
efficient ribosome binding site is desirable. The expression vector may also include a signal
peptide sequence that provides for secretion of the lung cancer protein in bacteria. The
protein is either secreted into the growth media (gram-positive bacteria) or into the
periplasmic space, located between the inner and outer membrane of the cell (gram-negative
bacteria). The bacterial expression vector may also include a selectable marker gene to allow
for the selection of bacterial strains that have been transformed. Suitable selection genes
include genes which render the bacteria resistant to drugs such as ampicillin,
chloramphenicol, erythromycin, kanamycin, neomycin and tetracycline. Selectable markers
also include biosynthetic genes, such as those in the histidine, tryptophan and leucine
biosynthetic pathways. These components are assembled into expression vectors. Expression
vectors for bacteria are well known in the art, and include vectors for Bacillus subtilis, E.
coli, Streptococcus cremoris, and Streptococcus lividans, among others (e.g., Fernandez and
Hoeffler, supra). The bacterial expression vectors are transformed into bacterial host cells
using techniques well known in the art, such as calcium chloride treatment, electroporation,
and others.

In one embodiment, lung cancer proteins are produced in insect cells. Expression
vectors for the transformation of insect cells, and in particular, baculovirus-based expression
vectors, are well known in the art.

In a preferred embodiment, lung cancer protein is produced in yeast cells. Yeast
expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guillerimondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The lung cancer protein may also be made as a fusion protein, using techniques well
known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the desired epitope
is small, the lung cancer protein may be fused to a carrier protein to form an immunogen.
Alternatively, the lung cancer protein may be made as a fusion protein to increase expression
for affinity purification purposes, or for other reasons. For example, when the lung cancer
protein is a lung cancer peptide, the nucleic acid encoding the peptide may be linked to other
nucleic acid for expression purposes.

In a preferred embodiment, the lung cancer protein is purified or isolated after
expression. Lung cancer proteins may be isolated or purified in a variety of appropriate
ways. Standard purification methods include electrophoretic, molecular, immunological and
chromatographic techniques, including ion exchange, hydrophobic, affinity, and
reverse-phase HPLC chromatography, and chromatofocusing. For example, the lung cancer protein
may be purified using a standard anti-lung cancer protein antibody column. Ultrafiltration
and diafiltration techniques, in conjunction with protein concentration, are also useful. For
general guidance in suitable purification techniques, see Scopes (1982) Protein Purification.
The degree of purification necessary will vary depending on the use of the lung cancer
protein. In some instances no purification will be necessary.

Once expressed and purified if necessary, the lung cancer proteins and nucleic acids
are useful in a number of applications. They may be used as immunoselection reagents, as
vaccine reagents, as screening agents, therapeutic entities, for production of antibodies, as
transcription or translation inhibitors, etc.

Variants of lung cancer proteins

In one embodiment, the lung cancer proteins are derivative or variant lung cancer 
proteins as compared to the wild-type sequence. That is, as outlined more fully below, the 
derivative lung cancer peptide will often contain at least one amino acid substitution, deletion 
or insertion, with amino acid substitutions being particularly preferred. The amino acid 
substitution, insertion or deletion may occur at a particular residue within the lung cancer
peptide. 

Also included within one embodiment of lung cancer proteins of the present invention
are amino acid sequence variants. These variants typically fall into one or more of three
classes: substitutional, insertional or deletional variants. These variants ordinarily are
prepared by site specific mutagenesis of nucleotides in the DNA encoding the lung cancer
protein, using cassette or PCR mutagenesis or other techniques, to produce DNA encoding
the variant, and thereafter expressing the DNA in recombinant cell culture as outlined above.
However, variant lung cancer protein fragments having up to about 100-150 residues may be
prepared by in vitro synthesis. Amino acid sequence variants are characterized by the
predetermined nature of the variation, a feature that sets them apart from naturally occurring
allelic or interspecies variation of the lung cancer protein amino acid sequence. The variants
typically exhibit a similar qualitative biological activity as the naturally occurring analogue,
although variants can also be selected which have modified characteristics as will be more
fully outlined below.

While the site or region for introducing an amino acid sequence variation is often
predetermined, the mutation per se need not be predetermined. For example, in order to
optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed lung cancer variants screened for
the optimal combination of desired activity. Techniques exist for making substitution
mutations at predetermined sites in DNA having a known sequence, e.g., M13 primer
mutagenesis and PCR mutagenesis. Screening of mutants is often done using assays of lung
cancer protein activities.
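
Mutagenesis at a single target codon can be sketched computationally by enumerating every
alternative codon at that position and grouping the resulting coding sequences by the residue
they encode. This is an illustration only, assuming the standard genetic code and an in-frame
DNA coding sequence; the names CODON_TABLE and saturate_codon are hypothetical.

```python
# Illustrative sketch only: enumerate all single-codon replacements at one position.
# Assumes the standard genetic code; all names are made up for this example.
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order with the third base varying fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: residue
    for (a, b, c), residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def saturate_codon(coding_seq: str, codon_index: int) -> dict:
    """Map each encoded residue ('*' = stop) to the mutant coding sequences.

    codon_index is zero-based; the wild-type codon itself is left out, so 63
    mutant sequences are returned in total.
    """
    seq = coding_seq.upper()
    start = 3 * codon_index
    wild_type = seq[start:start + 3]
    if wild_type not in CODON_TABLE:
        raise ValueError("codon_index does not point at a complete A/C/G/T codon")
    variants = defaultdict(list)
    for codon, residue in CODON_TABLE.items():
        if codon != wild_type:
            variants[residue].append(seq[:start] + codon + seq[start + 3:])
    return dict(variants)


if __name__ == "__main__":
    library = saturate_codon("ATGGCTTGG", 1)  # placeholder sequence: Met-Ala-Trp
    for residue, sequences in sorted(library.items()):
        print(residue, len(sequences))
```

Grouping by encoded residue makes it easy to see which substitutions a screen would need to
cover and which codon changes are silent.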

Amino acid substitutions are typically of single residues; insertions usually will be on
the order of from about 1 to 20 amino acids, although considerably larger insertions may be
occasionally tolerated. Deletions generally range from about 1 to about 20 residues, although
in some cases deletions may be much larger.

Substitutions, deletions, insertions or any combination thereof may be used to arrive
at a final derivative. Generally these changes are done on a few amino acids to minimize the
alteration of the molecule. Larger changes may be tolerated in certain circumstances. When 
small alterations in the characteristics of a lung cancer protein are desired, substitutions are 
generally made in accordance with the amino acid substitution chart provided in the 
definition section. 

Variants typically exhibit essentially the same qualitative biological activity and will
elicit the same immune response as a naturally-occurring analog, although variants also are
selected to modify the characteristics of lung cancer proteins as needed. Alternatively, the
variant may be designed or reorganized such that a biological activity of the lung cancer
protein is altered. For example, glycosylation sites may be added, altered, or removed.

Covalent modifications of lung cancer polypeptides are included within the scope of
this invention. One type of covalent modification includes reacting targeted amino acid
residues of a lung cancer polypeptide with an organic derivatizing agent that is capable of
reacting with selected side chains or the N- or C-terminal residues of a lung cancer
polypeptide. Derivatization with bifunctional agents is useful, for instance, for crosslinking
lung cancer polypeptides to a water-insoluble support matrix or surface for use in a method
for purifying anti-lung cancer polypeptide antibodies or screening assays, as is more fully
described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g., esters
with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl esters such
as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl residues to
the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of proline and
lysine, phosphorylation of hydroxyl groups of serinyl, threonyl or tyrosyl residues,
methylation of the y-amino groups of lysine, arginine, and histidine side chains (Creighton
(1983) Proteins: Structure and Molecular Properties, pp. 79-86), acetylation of the N-terminal
amine, and amidation of any C-terminal carboxyl group.

Another type of covalent modification of the lung cancer polypeptide encompassed by
this invention is an altered native glycosylation pattern of the polypeptide. "Altering the
native glycosylation pattern" is intended herein to mean adding to or deleting one or more
carbohydrate moieties of a native sequence lung cancer polypeptide. Glycosylation patterns
can be altered in many ways. For example, the use of different cell types to express lung
cancer-associated sequences can result in different glycosylation patterns. 

Addition of glycosylation sites to lung cancer polypeptides may also be accomplished
by altering the amino acid sequence thereof. The alteration may be made, e.g., by the
addition of, or substitution by, one or more serine or threonine residues to the native sequence
lung cancer polypeptide (for O-linked glycosylation sites). The lung cancer amino acid
sequence may optionally be altered through changes at the DNA level, particularly by
mutating the DNA encoding the lung cancer polypeptide at preselected bases such that
codons are generated that will translate into the desired amino acids.

Another means of increasing the number of carbohydrate moieties on the lung cancer
polypeptide is by chemical or enzymatic coupling of glycosides to the polypeptide. Such
methods are described in the art, e.g., in WO 87/05330, and in Aplin and Wriston (1981)
CRC Crit. Rev. Biochem., pp. 259-306.

Removal of carbohydrate moieties present on the lung cancer polypeptide may be
accomplished chemically or enzymatically or by mutational substitution of codons encoding
for amino acid residues that serve as targets for glycosylation. Chemical deglycosylation
techniques are known in the art and described, for instance, by Hakimuddin, et al. (1987)
Arch. Biochem. Biophys., 259:52 and by Edge, et al. (1981) Anal. Biochem., 118:131.
Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a
variety of endo- and exo-glycosidases as described by Thotakura, et al. (1987) Meth.
Enzymol., 138:350.
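
The DNA-level changes discussed above, in which preselected bases are mutated so that a codon
encodes a desired residue such as serine or threonine, can be illustrated with a short sketch
that finds the replacement codons requiring the fewest base changes. It assumes the standard
genetic code; the names CODON_TABLE and closest_codons are hypothetical, and minimizing the
number of base changes is simply one convenient selection criterion chosen for this example.

```python
# Illustrative sketch only: pick the codon(s) for a desired residue that differ
# from an existing codon at the fewest bases. All names are assumptions.
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order with the third base varying fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: residue
    for (a, b, c), residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def closest_codons(codon: str, target_residue: str):
    """Return (number_of_base_changes, [codons]) for the nearest target codons."""
    codon = codon.upper()
    candidates = [c for c, aa in CODON_TABLE.items() if aa == target_residue]
    if codon not in CODON_TABLE or not candidates:
        raise ValueError("unknown codon or residue")

    def base_changes(candidate: str) -> int:
        # Count positions at which the two codons differ.
        return sum(x != y for x, y in zip(codon, candidate))

    fewest = min(base_changes(c) for c in candidates)
    return fewest, [c for c in candidates if base_changes(c) == fewest]


if __name__ == "__main__":
    # Placeholder example: convert an alanine codon to serine or threonine.
    print(closest_codons("GCT", "S"))
    print(closest_codons("GCT", "T"))
```

The same lookup applies in the reverse direction, e.g., when a serine or threonine codon is
to be replaced in order to remove a glycosylation target.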

Another type of covalent modification of lung cancer comprises linking the lung
cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g., polyethylene
glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in U.S. Patent
Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192, or 4,179,337. 

Lung cancer polypeptides of the present invention may also be modified in a way to
form chimeric molecules comprising a lung cancer polypeptide fused to another,
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of a lung cancer polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is
generally placed at the amino- or carboxyl-terminus of the lung cancer polypeptide. The
presence of such epitope-tagged forms of a lung cancer polypeptide can be detected using an
antibody against the tag polypeptide. Also, provision of the epitope tag enables the lung
cancer polypeptide to be readily purified by affinity purification using an anti-tag antibody or
another type of affinity matrix that binds to the epitope tag. In an alternative embodiment,
the chimeric molecule may comprise a fusion of a lung cancer polypeptide with an
immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of the
chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.

Various tag polypeptides and their respective antibodies are well known and examples
include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; HIS6 and metal
chelation tags, the flu HA tag polypeptide and its antibody 12CA5 (Field, et al. (1988) Mol.
Cell. Biol. 8:2159-2165); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and 9E10 antibodies
thereto (Evan, et al. (1985) Molecular and Cellular Biology 5:3610-3616); and the Herpes
Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky, et al. (1990) Protein
Engineering 3(6):547-553). Other tag polypeptides include the Flag-peptide (Hopp, et al.
(1988) BioTechnology 6:1204-1210); the KT3 epitope peptide (Martin, et al. (1992) Science
255:192-194); tubulin epitope peptide (Skinner, et al. (1991) J. Biol. Chem.
266:15163-15166); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth, et al. (1990) Proc. Nat'l
Acad. Sci. USA 87:6393-6397).

Also included are other lung cancer proteins of the lung cancer family, and lung
cancer proteins from other organisms, which are cloned and expressed as outlined below.
Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be used to
find other related lung cancer proteins from primates or other organisms. As will be
appreciated by those in the art, particularly useful probe and/or PCR primer sequences
include unique areas of the lung cancer nucleic acid sequence. As is generally known in the
art, preferred PCR primers are from about 15 to about 35 nucleotides in length, with from
about 20 to about 30 being preferred, and may contain inosine as needed. PCR reaction
conditions are well known in the art (e.g., Innis, PCR Protocols, supra).
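
The primer length ranges stated above can be illustrated with a short sketch. It is not part of the specification: the function name, the use of the letter "I" for inosine, and the treatment of "about" as hard cutoffs are all assumptions made for illustration.

```python
def classify_primer(seq: str) -> str:
    """Classify a primer by length using the ranges quoted in the text.

    Assumptions (not from the specification): 'I' denotes inosine, and the
    "about 15 to about 35" and "about 20 to about 30" ranges are treated as
    hard, inclusive cutoffs.
    """
    allowed = set("ACGTI")
    seq = seq.upper()
    if not seq or set(seq) - allowed:
        raise ValueError("primer must contain only A, C, G, T or I")
    n = len(seq)
    if 20 <= n <= 30:
        return "more preferred"       # about 20 to about 30 nucleotides
    if 15 <= n <= 35:
        return "preferred"            # about 15 to about 35 nucleotides
    return "outside stated range"


if __name__ == "__main__":
    # A hypothetical 24-mer containing one inosine.
    print(classify_primer("ACGTACGTACGIACGTACGTACGT"))
```

A real primer design tool would also consider melting temperature, degeneracy, and uniqueness within the target sequence, none of which is modeled here.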

Antibodies to lung cancer proteins 

In a preferred embodiment, when a lung cancer protein is to be used to generate
antibodies, e.g., for immunotherapy or immunodiagnosis, the lung cancer protein should
share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller lung cancer protein will be able to bind to the full-length protein,
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are well known (e.g., Coligan, supra; and
Harlow and Lane, supra). Polyclonal antibodies can be raised in a mammal, e.g., by one or
more injections of an immunizing agent and, if desired, an adjuvant. Typically, the
immunizing agent and/or adjuvant will be injected in the mammal by multiple subcutaneous
or intraperitoneal injections. The immunizing agent may include a protein encoded by a
nucleic acid of Tables 1A-16 or fragment thereof or a fusion protein thereof. It may be useful
to conjugate the immunizing agent to a protein known to be immunogenic in the mammal
being immunized. Immunogenic proteins include, e.g., keyhole limpet hemocyanin, serum
albumin, bovine thyroglobulin, and soybean trypsin inhibitor. Adjuvants include, e.g.,
Freund's complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic
trehalose dicorynomycolate). The immunization protocol may be selected by one skilled in
the art.

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies
may be prepared using hybridoma methods, such as those described by Kohler and Milstein
(1975) Nature 256:495. In a hybridoma method, a mouse, hamster, or other appropriate host
animal is typically immunized with an immunizing agent to elicit lymphocytes that produce
or are capable of producing antibodies that will specifically bind to the immunizing agent.
Alternatively, the lymphocytes may be immunized in vitro. The immunizing agent will
typically include a polypeptide encoded by a nucleic acid of the tables, or fragment thereof,
or a fusion protein thereof. Generally, either peripheral blood lymphocytes ("PBLs") are
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell (Goding (1986) Monoclonal Antibodies: Principles and Practice, pp. 59-103).
Immortalized cell lines are usually transformed mammalian cells, particularly myeloma cells
of rodent, bovine, or primate origin. Usually, rat or mouse myeloma cell lines are employed.
The hybridoma cells may be cultured in a suitable culture medium that preferably contains
one or more substances that inhibit the growth or survival of the unfused, immortalized cells.
For example, if the parental cells lack the enzyme hypoxanthine guanine phosphoribosyl
transferase (HGPRT or HPRT), the culture medium for the hybridomas typically will include
hypoxanthine, aminopterin, and thymidine ("HAT medium"), which substances prevent the
growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific antibodies are
typically monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of the tables or a fragment thereof; the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit,
preferably one that is tumor specific. Alternatively, tetramer-type technology may create
multivalent reagents.

In a preferred embodiment, the antibodies to lung cancer protein are capable of
reducing or eliminating a biological function of a lung cancer protein, in a naked form or
conjugated to an effector moiety. That is, the addition of anti-lung cancer protein antibodies
(either polyclonal or preferably monoclonal) to lung cancer tissue (or cells containing lung
cancer) may reduce or eliminate the lung cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred.

In a preferred embodiment the antibodies to the lung cancer proteins are humanized
antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein Design Labs,
Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric molecules of
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) which contain minimal
sequence derived from non-human immunoglobulin. Humanized antibodies include human
immunoglobulins (recipient antibody) in which residues from a complementarity determining
region (CDR) of the recipient are replaced by residues from a CDR of a non-human species
(donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and
capacity. In some instances, Fv framework residues of a human immunoglobulin are
replaced by corresponding non-human residues. Humanized antibodies may also comprise
residues which are found neither in the recipient antibody nor in the imported CDR or
framework sequences. In general, a humanized antibody will comprise substantially all of at
least one, and typically two, variable domains, in which all or substantially all of the CDR
regions correspond to those of a non-human immunoglobulin and all or substantially all of
the framework (FR) regions are those of a human immunoglobulin consensus sequence. A
humanized antibody optimally will also comprise at least a portion of an
immunoglobulin constant region (Fc), typically that of a human immunoglobulin (Jones, et
al. (1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 332:323-327; and Presta
(1992) Curr. Op. Struct. Biol. 2:593-596). Humanization can be performed following the
method of Winter and co-workers (Jones, et al. (1986) Nature 321:522-525; Riechmann, et al.
(1988) Nature 332:323-327; Verhoeyen, et al. (1988) Science 239:1534-1536), by
substituting rodent CDRs or CDR sequences for corresponding sequences of a human
antibody. Accordingly, such humanized antibodies are chimeric antibodies (U.S. Patent No.
4,816,567), wherein substantially less than an intact human variable domain has been
substituted by corresponding sequence from a non-human species.

Human-like antibodies can also be produced using various techniques known in the
art, including phage display libraries (Hoogenboom and Winter (1991) J. Mol. Biol. 227:381;
Marks, et al. (1991) J. Mol. Biol. 222:581). The techniques of Cole, et al. and Boerner, et al.
are also available for the preparation of human monoclonal antibodies (Cole, et al. (1985)
Monoclonal Antibodies and Cancer Therapy, p. 77 and Boerner, et al. (1991) J. Immunol.
147(1):86-95). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in
nearly all respects, including gene rearrangement, assembly, and antibody repertoire. This
approach is described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in the following scientific publications: Marks, et al. (1992)
Bio/Technology 10:779-783; Lonberg, et al. (1994) Nature 368:856-859; Morrison (1994)
Nature 368:812-13; Fishwild, et al. (1996) Nature Biotechnology 14:845-51; Neuberger
(1996) Nature Biotechnology 14:826; and Lonberg and Huszar (1995) Intern. Rev. Immunol.
13:65-93.

By immunotherapy is meant treatment of lung cancer with an antibody raised against
a lung cancer protein. As used herein, immunotherapy can be passive or active. Passive
immunotherapy as defined herein is the passive transfer of antibody to a recipient (patient).
Active immunization is the induction of antibody and/or T-cell responses in a recipient
(patient). Induction of an immune response is the result of providing the recipient with an
antigen to which antibodies are raised. The antigen may be provided by injecting a
polypeptide against which antibodies are desired to be raised into a recipient, or contacting
the recipient with a nucleic acid capable of expressing the antigen and under conditions for
expression of the antigen, leading to an immune response.

In a preferred embodiment the lung cancer proteins against which antibodies are
raised are secreted proteins as described above. Without being bound by theory, antibodies
used for treatment may bind and prevent the secreted protein from binding to its receptor,
thereby inactivating the secreted lung cancer protein.

In another preferred embodiment, the lung cancer protein to which antibodies are
raised is a transmembrane protein. Without being bound by theory, antibodies used for
treatment may bind the extracellular domain of the lung cancer protein and prevent it from
binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane lung cancer protein. The
antibody may be a competitive, non-competitive or uncompetitive inhibitor of protein binding
to the extracellular domain of the lung cancer protein. The antibody may be an antagonist of
the lung cancer protein or may prevent activation of a transmembrane lung cancer protein, or
may induce or suppress a particular cellular pathway. In some embodiments, when the
antibody prevents the binding of other molecules to the lung cancer protein, the antibody
prevents growth of the cell. The antibody may also be used to target or sensitize the cell to
cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, INF-γ, and IL-2, or
chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin, methotrexate,
and the like. In some instances the antibody may belong to a sub-type that activates serum
complement when complexed with the transmembrane protein, thereby mediating cytotoxicity
or antibody-dependent cytotoxicity (ADCC). Thus, lung cancer may be treated by
administering to a patient antibodies directed against the transmembrane lung cancer protein.
Antibody-labeling may activate a co-toxin, localize a toxin payload, or otherwise provide
means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector moiety.
The effector moiety can be various molecules, including labeling moieties such as radioactive
labels or fluorescent labels, or can be a therapeutic moiety. In one aspect the therapeutic
moiety is a small molecule that modulates the activity of a lung cancer protein. In another
aspect the therapeutic moiety may modulate an activity of molecules associated with or in
close proximity to a lung cancer protein. The therapeutic moiety may inhibit enzymatic or
signaling activity such as protease or collagenase activity associated with lung cancer.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic agent. In
this method, targeting the cytotoxic agent to lung cancer tissue or cells results in a reduction
in the number of afflicted cells, thereby reducing symptoms associated with lung cancer.
Cytotoxic agents are numerous and varied and include, but are not limited to, cytotoxic drugs
or toxins or active fragments of such toxins. Suitable toxins and their corresponding
fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A chain, curcin,
crotin, phenomycin, enomycin, saporin, auristatin, and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against lung cancer
proteins, or binding of a radionuclide to a chelating agent that has been covalently attached to
the antibody. Targeting the therapeutic moiety to transmembrane lung cancer proteins not
only serves to increase the local concentration of therapeutic moiety in the lung cancer
afflicted area, but also serves to reduce deleterious side effects that may be associated with
the untargeted therapeutic moiety.

In another preferred embodiment, the lung cancer protein against which the antibodies
are raised is an intracellular protein. In this case, the antibody may be conjugated to a protein
or other entity which facilitates entry into the cell. In one case, the antibody enters the cell by
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, wherein the lung cancer protein can be targeted within a
cell, e.g., the nucleus, an antibody thereto may contain a signal for that target localization, e.g.,
a nuclear localization signal.

The lung cancer antibodies of the invention specifically bind to lung cancer proteins.
By "specifically bind" herein is meant that the antibodies bind to the protein with a Kd of at
least about 0.1 mM, more usually at least about 1 μM, preferably at least about 0.1 μM or
better, and most preferably, 0.01 μM or better. Selectivity of binding to the specific target
and not to related other sequences is also important.
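
The tiered Kd definition above can be sketched as a lookup in which a lower Kd means tighter binding. This is an illustration only: the function name and tier labels are hypothetical, the "about" qualifiers are treated as hard cutoffs, and the final tier is assumed to be 0.01 μM because the unit is not legible in the source text.

```python
def binding_tier(kd_molar: float) -> str:
    """Map a dissociation constant (in mol/L) to the tiers quoted in the text.

    Lower Kd means stronger binding, so "at least about X or better" is read
    as Kd <= X. The 1e-8 M tier (0.01 uM) is an assumption about the unit.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    tiers = [
        (1e-8, "most preferred"),   # 0.01 uM or better (unit assumed)
        (1e-7, "preferred"),        # 0.1 uM or better
        (1e-6, "more usual"),       # about 1 uM
        (1e-4, "specific"),         # about 0.1 mM
    ]
    for limit, label in tiers:
        if kd_molar <= limit:
            return label
    return "not specific by this definition"


if __name__ == "__main__":
    # A hypothetical antibody with a 5 nM dissociation constant.
    print(binding_tier(5e-9))
```

The sketch covers affinity only; the selectivity requirement mentioned in the text (binding the target and not related sequences) would need separate measurements against the related proteins.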

Detection of lung cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the lung cancer phenotype. Expression levels of genes in normal tissue (e.g.,
not undergoing lung cancer), in lung cancer tissue (and in some cases, for varying severities
of lung cancer that relate to prognosis, as outlined below), or in non-malignant disease are
evaluated to provide expression profiles. A gene expression profile of a particular cell state
or point of development is essentially a "fingerprint" of the state of the cell. While two states
may have a particular gene similarly expressed, the evaluation of a number of genes
simultaneously allows the generation of a gene expression profile that is reflective of the state
of the cell. By comparing expression profiles of cells in different states, information
regarding which genes are important (including both up- and down-regulation of genes) in
each of these states is obtained. Then, diagnosis may be performed or confirmed to
determine whether a tissue sample has the gene expression profile of normal or cancerous
tissue. This will provide for molecular diagnosis of related conditions.

"Differential expression," or grammatical equivalents as used herein, refers to
qualitative or quantitative differences in the temporal and/or cellular gene expression
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus lung cancer tissue. Genes may be turned on or turned off in a particular state,
relative to another state thus permitting comparison of two or more states. A qualitatively
regulated gene will exhibit an expression pattern within a state or cell type which is
detectable by standard techniques. Some genes will be expressed in one state or cell type, but
not in both. Alternatively, the difference in expression may be quantitative, e.g., in that
expression is increased or decreased; i.e., gene expression is either upregulated, resulting in
an increased amount of transcript, or downregulated, resulting in a decreased amount of
transcript. The degree to which expression differs need only be large enough to quantify via
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart (1996) Nature Biotechnology 14:1675-1680, hereby
expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined
above, preferably the change in expression (i.e., upregulation or downregulation) is typically
at least about 50%, more preferably at least about 100%, more preferably at least about
150%, more preferably at least about 200%, with from 300 to at least 1000% being especially
preferred.
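
The percentage thresholds above can be made concrete with a small sketch. The text does not say how a percent change is computed for downregulation, so the code assumes a symmetric definition based on the ratio of the larger to the smaller expression level (a 2-fold change in either direction counts as 100%). The function names and the list of thresholds as hard cutoffs are likewise assumptions.

```python
from typing import Optional

# Thresholds quoted in the text, in percent.
THRESHOLDS = (50.0, 100.0, 150.0, 200.0, 300.0)


def change_percent(reference: float, sample: float) -> float:
    """Symmetric percent change between two positive expression levels.

    Assumption: the change is (fold - 1) * 100, where fold is the ratio of
    the larger level to the smaller one, so up- and downregulation are
    treated alike.
    """
    if reference <= 0 or sample <= 0:
        raise ValueError("expression levels must be positive")
    fold = max(reference, sample) / min(reference, sample)
    return (fold - 1.0) * 100.0


def highest_threshold_met(reference: float, sample: float) -> Optional[float]:
    """Return the largest quoted threshold the change reaches, or None."""
    pct = change_percent(reference, sample)
    met = [t for t in THRESHOLDS if pct >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical signal intensities for one gene in normal and tumor tissue.
    print(highest_threshold_met(100.0, 400.0))
```

Real array or RT-PCR data would first need normalization and replicate handling, which this sketch leaves out.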

Evaluation may be at the gene transcript or the protein level. The amount of gene
expression may be monitored using nucleic acid probes to the RNA or DNA equivalent of the
gene transcript, and the quantification of gene expression levels, or, alternatively, the final
gene product itself (protein) can be monitored, e.g., with antibodies to the lung cancer protein
and standard immunoassays (ELISAs, etc.) or other techniques, including mass spectroscopy
assays, 2D gel electrophoresis assays, etc. Proteins corresponding to lung cancer genes, e.g.,
those identified as being important in a lung cancer or disease phenotype, can be evaluated in
a lung cancer diagnostic test. In a preferred embodiment, gene expression monitoring is
performed simultaneously on a number of genes.

The lung cancer nucleic acid probes may be attached to biochips as outlined herein for
the detection and quantification of lung cancer sequences in a particular cell. The assays are
further described below in the example. PCR techniques can be used to provide greater
sensitivity. Multiple protein expression monitoring can be performed as well. Similarly,
these assays may be performed on an individual basis as well.

In a preferred embodiment nucleic acids encoding the lung cancer protein are
detected. Although DNA or RNA encoding the lung cancer protein may be detected, of
particular interest are methods wherein an mRNA encoding a lung cancer protein is detected.
Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is complementary to
and hybridizes with the mRNA and includes, but is not limited to, oligonucleotides, cDNA or
RNA. Probes also should contain a detectable label, as defined herein. In one method the
mRNA is detected after immobilizing the nucleic acid to be examined on a solid support such
as nylon membranes and hybridizing the probe with the sample. Following washing to
remove the non-specifically bound probe, the label is detected. In another method detection
of the mRNA is performed in situ. In this method permeabilized cells or tissue samples are
contacted with a detectably labeled nucleic acid probe for sufficient time to allow the probe
to hybridize with the target mRNA. Following washing to remove the non-specifically bound
probe, the label is detected. For example, a digoxigenin labeled riboprobe (RNA probe) that
is complementary to the mRNA encoding a lung cancer protein is detected by binding the
digoxigenin with an anti-digoxigenin secondary antibody and developed with nitro blue
tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins as
described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The lung cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing lung cancer sequences are used in diagnostic assays. This can be performed on an
individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, lung cancer proteins, including intracellular,
transmembrane, or secreted proteins, find use as markers of lung cancer, e.g., for prognostic
or diagnostic purposes. Detection of these proteins in putative lung cancer tissue allows for
detection, prognosis, or diagnosis of lung cancer or similar disease, and perhaps for selection
of therapeutic strategy. In one embodiment, antibodies are used to detect lung cancer
proteins. A preferred method separates proteins from a sample by electrophoresis on a gel
(typically a denaturing and reducing protein gel, but may be another type of gel, including
isoelectric focusing gels and the like). Following separation of proteins, the lung cancer
protein is detected, e.g., by immunoblotting with antibodies raised against the lung cancer
protein. Methods of immunoblotting are well known to those of ordinary skill in the art.

In another preferred method, antibodies to the lung cancer protein find use in in situ
imaging techniques, e.g., in histology (e.g., Asai (ed. 1993) Methods in Cell Biology:
Antibodies in Cell Biology, volume 37). In this method cells are contacted with from one to
many antibodies to the lung cancer protein(s). Following washing to remove non-specific
antibody binding, the presence of the antibody or antibodies is detected. In one embodiment
the antibody is detected by incubating with a secondary antibody that contains a detectable
label, e.g., multicolor fluorescence or confocal imaging. In another method the primary
antibody to the lung cancer protein(s) contains a detectable label, e.g., an enzyme marker that
can act on a substrate. In another preferred embodiment each one of multiple primary
antibodies contains a distinct and detectable label. This method finds particular use in
simultaneous screening for a plurality of lung cancer proteins. Many other histological
imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the ability
to detect and distinguish emissions of different wavelengths. In addition, a fluorescence
activated cell sorter (FACS) can be used in the method.

In another preferred embodiment, antibodies find use in diagnosing lung cancer from
blood, serum, plasma, stool, and other samples. Such samples, therefore, are useful as
samples to be probed or tested for the presence of lung cancer proteins. Antibodies can be
used to detect a lung cancer protein by previously described immunoassay techniques
including ELISA, immunoblotting (western blotting), immunoprecipitation, BIACORE
technology and the like. Conversely, the presence of antibodies may indicate an immune
response against an endogenous lung cancer protein or vaccine.

In a preferred embodiment, in situ hybridization of labeled lung cancer nucleic acid
probes to tissue arrays is done. For example, arrays of tissue samples, including lung cancer
tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel, supra) is then
performed. When comparing the fingerprints between an individual and a standard, the
skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is
further understood that the genes which indicate the diagnosis may differ from those which
indicate the prognosis, and molecular profiling of the condition of the cells may lead to
distinctions between responsive or refractory conditions or may be predictive of outcomes.

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in prognosis assays.
As above, gene expression profiles can be generated that correlate to lung cancer, clinical,
pathological, or other information, in terms of long term prognosis. Again, this may be done
on either a protein or gene level, with the use of genes being preferred. Single or multiple
genes may be useful in various combinations. As above, lung cancer probes may be attached
to biochips for the detection and quantification of lung cancer sequences in a tissue or patient.
The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

Assays for therapeutic compounds 

In a preferred embodiment, the proteins, nucleic acids, and antibodies as described
herein are used in drug screening assays. The lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in drug screening
assays or by evaluating the effect of drug candidates on a "gene expression profile" or
expression profile of polypeptides. In a preferred embodiment, the expression profiles are
used, preferably in conjunction with high throughput screening techniques to allow
monitoring for expression profile genes after treatment with a candidate agent (e.g.,
Zlokarnik, et al. (1998) Science 279:84-8; Heid (1996) Genome Res. 6:986-94).

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing the native or modified lung cancer proteins are used in
screening assays. That is, the present invention provides novel methods for screening for
compositions which modulate the lung cancer phenotype or an identified physiological
function of a lung cancer protein. As above, this can be done on an individual gene level or
by evaluating the effect of drug candidates on a "gene expression profile". In a preferred
embodiment, the expression profiles are used, preferably in conjunction with high throughput
screening techniques to allow monitoring for expression profile genes after treatment with a
candidate agent, see Zlokarnik, supra.

Having identified differentially expressed genes herein, a variety of assays may be
performed. In a preferred embodiment, assays may be run on an individual gene or protein
level. That is, having identified a particular gene with altered regulation in lung cancer, test
compounds can be screened for the ability to modulate gene expression or for binding to the
lung cancer protein. "Modulation" thus includes an increase or a decrease in gene
expression. The preferred amount of modulation will depend on the original change of the
gene expression in normal versus tissue undergoing lung cancer, with changes of at least
10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or
greater. Thus, if a gene exhibits a 4-fold increase in lung cancer tissue compared to normal
tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in lung
cancer tissue compared to normal tissue often provides a target value of a 10-fold increase in
expression to be induced by the test compound.
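
The rule of thumb above (reverse the observed fold change) can be written as a minimal sketch. The function name and the tuple return format are hypothetical and chosen only for illustration; the inputs are assumed to be positive expression levels measured on the same scale.

```python
from typing import Tuple


def target_modulation(normal: float, tumor: float) -> Tuple[str, float]:
    """Return the direction and fold change a test compound should induce.

    If expression is N-fold higher in tumor tissue, the target is an N-fold
    decrease; if it is N-fold lower, the target is an N-fold increase.
    """
    if normal <= 0 or tumor <= 0:
        raise ValueError("expression levels must be positive")
    if tumor > normal:
        return ("decrease", tumor / normal)
    if tumor < normal:
        return ("increase", normal / tumor)
    return ("none", 1.0)


if __name__ == "__main__":
    # Hypothetical levels: 4-fold up in tumor, then 10-fold down in tumor.
    print(target_modulation(10.0, 40.0))
    print(target_modulation(50.0, 5.0))
```

The two calls mirror the two cases given in the text: a 4-fold increase in lung cancer tissue yields a target of a 4-fold decrease, and a 10-fold decrease yields a target of a 10-fold increase.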

The amount of gene expression may be monitored using nucleic acid probes and the
quantification of gene expression levels, or, alternatively, the gene product itself can be
monitored, e.g., through the use of antibodies to the lung cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, gene or protein expression of a number of
entities, i.e., an expression profile, is monitored simultaneously. Such profiles will typically
involve a plurality of those entities described herein.

In this embodiment, the lung cancer nucleic acid probes are attached to biochips as
outlined herein for the detection and quantification of lung cancer sequences in a particular
cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates, may be used
with dispensed primers in desired wells. A PCR reaction can then be performed and analyzed
for each well.

Expression monitoring can be performed to identify compounds that modify the
expression of one or more lung cancer-associated sequences, e.g., a polynucleotide sequence
set out in the tables. Generally, in a preferred embodiment, a test compound is added to the
cells prior to analysis. Moreover, screens are also provided to identify agents that modulate
lung cancer, modulate lung cancer proteins, bind to a lung cancer protein, or interfere with
the binding of a lung cancer protein and an antibody, substrate, or other binding partner.

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes a molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the lung cancer phenotype or the expression of a lung cancer sequence, e.g., a
nucleic acid or protein sequence. In preferred embodiments, modulators alter expression
profiles of nucleic acids or proteins provided herein. In one embodiment, the modulator
suppresses a lung cancer phenotype, e.g., to a normal or non-malignant tissue fingerprint. In
another embodiment, a modulator induces a lung cancer phenotype. Generally, a plurality of
assay mixtures are run in parallel with different agent concentrations to obtain a differential
response to the various concentrations. Typically, one of these concentrations serves as a
negative control, i.e., at zero concentration or below the level of detection.

In one aspect, a modulator will neutralize the effect of a lung cancer protein. By
"neutralize" is meant that activity of a protein and the consequent effect on the cell is
inhibited or blocked.
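
The parallel assay mixtures described above, run at different agent concentrations with one zero-concentration negative control, can be laid out as in the following sketch. The text does not specify how the concentrations are spaced; the serial dilution used here, the function name, and the parameter names are assumptions made for illustration.

```python
from typing import List


def concentration_series(top: float, dilution_factor: float,
                         n_points: int) -> List[float]:
    """Build agent concentrations for parallel assay mixtures.

    Assumption: concentrations follow a serial dilution from `top` downward.
    A final 0.0 entry is appended as the negative control mentioned in the
    text (zero concentration).
    """
    if top <= 0 or dilution_factor <= 1 or n_points < 1:
        raise ValueError("need top > 0, dilution_factor > 1, n_points >= 1")
    series = [top / dilution_factor ** i for i in range(n_points)]
    return series + [0.0]


if __name__ == "__main__":
    # Hypothetical 10-fold dilution series starting at 100 (arbitrary units).
    for conc in concentration_series(100.0, 10.0, 3):
        print(conc)
```

Each returned value would correspond to one assay mixture; the units and the starting concentration depend on the compound and the assay and are not given in the text.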

In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to a lung cancer polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve providing a
library containing a large number of potential therapeutic compounds (candidate
compounds). Such "combinatorial chemical libraries" are then screened in one or more
assays to identify those library members (particular chemical species or subclasses) that
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical compounds
generated by either chemical synthesis or biological synthesis by combining a number of
chemical "building blocks" such as reagents. For example, a linear combinatorial chemical
library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of chemical
building blocks called amino acids in every possible way for a given compound length (i.e.,
the number of amino acids in a polypeptide compound). Millions of chemical compounds
can be synthesized through such combinatorial mixing of chemical building blocks (Gallop,
et al. (1994) J. Med. Chem. 37(9):1233-1251).
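The combinatorial growth described above can be illustrated with a short Python sketch. It assumes the 20 standard amino acids as building blocks; the helper names `library_size` and `enumerate_library` are hypothetical and serve only to show how quickly a linear library reaches millions of members.

```python
from itertools import product

# The 20 standard amino acids as one-letter codes (assumed building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length, alphabet=AMINO_ACIDS):
    """Number of distinct linear sequences of the given length."""
    return len(alphabet) ** length


def enumerate_library(length, alphabet=AMINO_ACIDS):
    """Yield every sequence of the given length (practical only for short lengths)."""
    for residues in product(alphabet, repeat=length):
        yield "".join(residues)


dipeptides = list(enumerate_library(2))
print(len(dipeptides))   # 400
print(library_size(5))   # 3200000
```

With 20 building blocks, a library of pentapeptides already contains 20^5 = 3,200,000 members, consistent with the "millions of chemical compounds" noted above.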

Preparation and screening of combinatorial chemical libraries is well known to those
of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka (1991) Pept. Prot. Res.
37:487-493, Houghton, et al. (1991) Nature, 354:84-88), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs, et al. (1993) Proc. Nat. Acad. Sci. USA
90:6909-6913), vinylogous polypeptides (Hagihara, et al. (1992) J. Amer. Chem. Soc.
114:6568), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding (Hirschmann, et
al. (1992) J. Amer. Chem. Soc. 114:9217-9218), analogous organic syntheses of small
compound libraries (Chen, et al. (1994) J. Amer. Chem. Soc. 116:2661), oligocarbamates
(Cho, et al. (1993) Science 261:1303), and/or peptidyl phosphonates (Campbell, et al. (1994)
J. Org. Chem. 59:658). See, generally, Gordon, et al. (1994) J. Med. Chem. 37:1385, nucleic
acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid libraries (see, e.g., U.S.
Patent 5,539,083), antibody libraries (see, e.g., Vaughn, et al. (1996) Nature Biotechnology
14(3):309-314, and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang, et al. (1996)
Science 274:1520-1522, and U.S. Patent No. 5,593,853), and small organic molecule libraries
(see, e.g., benzodiazepines, Baum (1993) C&EN, Jan 18, page 33; isoprenoids, U.S. Patent
No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No. 5,549,974; pyrrolidines,
U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent No.
5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially available (see,
e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony, Rainin,
Woburn, MA, 433A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore, Bedford,
MA).

A number of well known robotic systems have also been developed for solution phase
chemistries. These systems include automated workstations like the automated synthesis
apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic
systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca,
Hewlett-Packard, Palo Alto, Calif.), which mimic the manual synthetic operations performed
by a chemist. The above devices, with appropriate modification, are suitable for use with the
present invention. In addition, numerous combinatorial libraries are themselves
commercially available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos,
Inc., St. Louis, MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek
Biosciences, Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening.
Preferred assays thus detect modulation of lung cancer gene transcription, polypeptide
expression, and polypeptide activity.

High throughput assays for evaluating the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Similarly, binding assays and reporter gene assays are well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins,
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available (see, e.g.,
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc. Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate procedures, including sample and reagent pipetting, liquid dispensing,
timed incubations, and final readings of the microplate in detector(s) appropriate for the
assay. These configurable systems provide high throughput and rapid start up as well as a
high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides
technical bulletins describing screening systems for detecting the modulation of gene
transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring proteins or
fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing proteins, or
random or directed digests of proteinaceous cellular extracts, may be used. In this way
libraries of proteins may be made for screening in the methods of the invention. Particularly
preferred in this embodiment are libraries of bacterial, fungal, viral, and mammalian proteins,
with the latter being preferred, and human proteins being especially preferred. Particularly
useful test compounds will be directed to the class of proteins to which the target belongs, e.g.,
substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about 30
amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to
about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins, random peptides, or "biased" random peptides. By "randomized" or grammatical
equivalents herein is meant that the nucleic acid or peptide consists of essentially random
sequences of nucleotides and amino acids, respectively. Since these random peptides (or
nucleic acids, discussed below) are often chemically synthesized, they may incorporate a
nucleotide or amino acid at any position. The synthetic process can be designed to generate
randomized proteins or nucleic acids, to allow the formation of all or most of the possible
combinations over the length of the sequence, thus forming a library of randomized candidate
bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence preferences or
constants at any position. In a preferred embodiment, the library is biased. That is, some
positions within the sequence are either held constant, or are selected from a limited number
of possibilities. In a preferred embodiment, the nucleotides or amino acid residues are
randomized within a defined class, e.g., of hydrophobic amino acids, hydrophilic residues,
sterically biased (either small or large) residues, towards the creation of nucleic acid binding
domains, the creation of cysteines, for cross-linking, prolines for SH-3 domains, serines,
threonines, tyrosines or histidines for phosphorylation sites, etc.
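The distinction between fully randomized and biased libraries can be sketched in Python as a per-position template, in which each position is either held constant, restricted to a defined class, or left fully random. The template, the residue grouping in `HYDROPHOBIC`, and the function name `random_peptide` are illustrative assumptions and are not taken from the specification.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # fully random positions draw from all 20
HYDROPHOBIC = "AVILMFW"               # one illustrative "defined class"


def random_peptide(template, rng):
    """Build one peptide from a per-position template.

    Each template entry is a string of allowed residues: a single letter
    holds the position constant, a longer string restricts the choice.
    """
    return "".join(rng.choice(allowed) for allowed in template)


# Hypothetical 7-mer: constant cysteines at both ends (e.g., for cross-linking),
# two positions restricted to hydrophobic residues, three fully randomized.
template = ["C", HYDROPHOBIC, AMINO_ACIDS, AMINO_ACIDS, AMINO_ACIDS, HYDROPHOBIC, "C"]

rng = random.Random(0)  # seeded only so that the sketch is reproducible
library = [random_peptide(template, rng) for _ in range(5)]
print(library)
```

A fully randomized library corresponds to a template in which every entry is `AMINO_ACIDS`; biasing simply narrows the allowed set at chosen positions.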

Modulators of lung cancer can also be nucleic acids, as defined above.

As described above generally for proteins, nucleic acid modulating agents may be
naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids.
Digests of procaryotic or eucaryotic genomes may be used as is outlined above for proteins.

In a preferred embodiment, the candidate compounds are organic chemical moieties, a
wide variety of which are available in the literature.

After a candidate agent has been added and the cells allowed to incubate for some
period of time, the sample containing a target sequence is analyzed. If required, the target
sequence is prepared using known techniques. For example, the sample may be treated to
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or
amplification such as PCR performed as appropriate. For example, an in vitro transcription
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with cy3 or cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a fluorescent, a
chemiluminescent, a chemical, or a radioactive signal, to provide a means of detecting the
target sequence's specific binding to a probe. The label also can be an enzyme, such as
alkaline phosphatase or horseradish peroxidase, which when provided with an appropriate
substrate produces a product that can be detected. Alternatively, the label can be a labeled
compound or small molecule, such as an enzyme inhibitor, that binds but is not catalyzed or
altered by the enzyme. The label also can be a moiety or compound, such as an epitope tag
or biotin which specifically binds to streptavidin. For the example of biotin, the streptavidin
is labeled as described above, thereby providing a detectable signal for the bound target
sequence. Unbound labeled streptavidin is typically removed prior to analysis.

Nucleic acid assays can be direct hybridization assays or can comprise "sandwich
assays", which include the use of multiple probes, as is generally outlined in U.S. Patent Nos.
5,681,702, 5,597,909, 5,545,730, 5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670,
5,591,584, 5,624,802, 5,635,352, 5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of
which are hereby incorporated by reference. In this embodiment, in general, the target nucleic
acid is prepared as outlined above, and then added to the biochip comprising a plurality of
nucleic acid probes, under conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention, including
high, moderate and low stringency conditions as outlined above. The assays are generally
run under stringency conditions which allow formation of the label probe hybridization
complex only in the presence of target. Stringency can be controlled by altering a step
parameter that is a thermodynamic variable, including, but not limited to, temperature,
formamide concentration, salt concentration, chaotropic salt concentration, pH, organic
solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is generally
outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain steps at
higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways. Components
of the reaction may be added simultaneously, or sequentially, in different orders, with
preferred embodiments outlined below. In addition, the reaction may include a variety of
other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.
which may be used to facilitate optimal hybridization and detection, and/or reduce
non-specific or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.

Screens are performed to identify modulators of the lung cancer phenotype. In one
embodiment, screening is performed to identify modulators that can induce or suppress a
particular expression profile, thus preferably generating the associated phenotype. In another
embodiment, e.g., for diagnostic applications, having identified differentially expressed genes
important in a particular state, screens can be performed to identify modulators that alter
expression of individual genes. In another embodiment, screening is performed to identify
modulators that alter a biological function of the expression product of a differentially
expressed gene. Again, having identified the importance of a gene in a particular state,
screens are performed to identify agents that bind and/or modulate the biological activity of
the gene product, or evaluate genetic polymorphisms.

Genes can be screened for those that are induced in response to a candidate agent.
After identifying a modulator based upon its ability to suppress a lung cancer expression
pattern leading to a normal expression pattern, or to modulate a single lung cancer gene
expression profile so as to mimic the expression of the gene from normal tissue, a screen as
described above can be performed to identify genes that are specifically modulated in
response to the agent. Comparing expression profiles between normal tissue and agent
treated lung cancer tissue reveals genes that are not expressed in normal tissue or lung cancer
tissue, but are expressed in agent treated tissue. These agent-specific sequences can be
identified and used by methods described herein for lung cancer genes or proteins. In
particular these sequences and the proteins they encode find use in marking or identifying
agent treated cells. In addition, antibodies can be raised against the agent induced proteins
and used to target novel therapeutics to the treated lung cancer tissue sample.
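The comparison described above amounts to a set difference over the genes detected in each sample type. The following Python sketch assumes that each profile has already been reduced to a set of "expressed" gene identifiers; the gene names and the function name `agent_specific` are hypothetical placeholders.

```python
# Hypothetical sets of genes called "expressed" in each sample type.
normal = {"GENE_A", "GENE_B"}
lung_cancer = {"GENE_B", "GENE_C"}
agent_treated = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}


def agent_specific(normal, cancer, treated):
    """Genes expressed in agent-treated tissue but in neither untreated tissue."""
    return treated - (normal | cancer)


print(sorted(agent_specific(normal, lung_cancer, agent_treated)))
# ['GENE_D', 'GENE_E']
```

In practice the call of "expressed" versus "not expressed" would depend on a detection threshold for the platform used, which this sketch leaves out.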

Thus, in one embodiment, a test compound is administered to a population of lung
cancer cells that have an associated lung cancer expression profile. By "administration" or
"contacting" herein is meant that the candidate agent is added to the cells in such a manner as
to allow the agent to act upon the cell, whether by uptake and intracellular action, or by
action at the cell surface. In some embodiments, nucleic acid encoding a proteinaceous
candidate agent (i.e., a peptide) may be put into a viral construct such as an adenoviral or
retroviral construct, and added to the cell, such that expression of the peptide agent is
accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems can also be used.

Once a test compound has been administered to the cells, the cells can be washed if
desired and are allowed to incubate under preferably physiological conditions for some
period of time. The cells are then harvested and a new gene expression profile is generated,
as outlined herein.

Thus, e.g., lung cancer or non-malignant tissue may be screened for agents that
modulate, e.g., induce or suppress a lung cancer phenotype. A change in at least one gene,
preferably many, of the expression profile indicates that the agent has an effect on lung
cancer activity. By defining such a signature for the lung cancer phenotype, screens for new
drugs that alter the phenotype can be devised. With this approach, the drug target need not be
known and need not be represented in the original expression screening platform, nor does
the level of transcript for the target protein need to change.

Measurement of lung cancer polypeptide activity, or of lung cancer or the lung cancer
phenotype can be performed using a variety of assays. For example, the effects of the test
compounds upon the function of the metastatic polypeptides can be measured by examining
parameters described above. A suitable physiological change that affects activity can be used
to assess the influence of a test compound on the polypeptides of this invention. When the
functional consequences are determined using intact cells or animals, one can also measure a
variety of effects such as, in the case of lung cancer associated with tumors, tumor growth,
tumor metastasis, neovascularization, hormone release, transcriptional changes to both known
and uncharacterized genetic markers (e.g., northern blots), changes in cell metabolism such as
cell growth or pH changes, and changes in intracellular second messengers such as cGMP. In
the assays of the invention, mammalian lung cancer polypeptide is typically used, e.g.,
mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in vitro.
For example, a lung cancer polypeptide is first contacted with a potential modulator and
incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment, the
lung cancer polypeptide levels are determined in vitro by measuring the level of protein or
mRNA. The level of protein is typically measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the lung cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot
blotting, are preferred. The level of protein or mRNA is typically detected using directly or
indirectly labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids,
radioactively or enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using a lung cancer protein
promoter operably linked to a reporter gene such as luciferase, green fluorescent protein,
CAT, or β-gal. The reporter construct is typically transfected into a cell. After treatment
with a potential modulator, the amount of reporter gene transcription, translation, or activity
is measured according to standard techniques known to those of skill in the art.

In a preferred embodiment, as outlined above, screens may be done on individual
genes and gene products (proteins). That is, having identified a particular differentially
expressed gene as important in a particular state, screening of modulators of the expression of
the gene or the gene product itself can be done. The gene products of differentially expressed
genes are sometimes referred to herein as "lung cancer proteins." The lung cancer protein
may be a fragment, or alternatively, be the full length protein to a fragment shown herein.

In one embodiment, screening for modulators of expression of specific genes is
performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or isolated
gene product is used; that is, the gene products of one or more differentially expressed
nucleic acids are made. For example, antibodies are generated to the protein gene products,
and standard immunoassays are run to determine the amount of protein present. Alternatively,
cells comprising the lung cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a lung cancer
protein and a candidate compound, and determining the binding of the compound to the lung
cancer protein. Preferred embodiments utilize the human lung cancer protein, although other
mammalian proteins may also be used, e.g., for the development of animal models of human
disease. In some embodiments, as outlined herein, variant or derivative lung cancer proteins
may be used.

Generally, in a preferred embodiment of the methods herein, the lung cancer protein
or the candidate agent is non-diffusably bound to an insoluble support, preferably having
isolated sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble
supports may be made of a composition to which the compositions can be bound, is readily
separated from soluble material, and is otherwise compatible with the overall method of
screening. The surface of such supports may be solid or porous and of a convenient shape.
Examples of suitable insoluble supports include microtiter plates, arrays, membranes and
beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon
or nitrocellulose, teflon™, etc. Microtiter plates and arrays are especially convenient because
a large number of assays can be carried out simultaneously, using small amounts of reagents
and samples. The particular manner of binding of the composition is typically not crucial so
long as it is compatible with the reagents and overall methods of the invention, maintains the
activity of the composition, and is nondiffusable. Preferred methods of binding include the
use of antibodies (which do not sterically block either the ligand binding site or activation
sequence when the protein is bound to the support), direct binding to "sticky" or ionic
supports, chemical crosslinking, the synthesis of the protein or agent on the surface, etc.
Following binding of the protein or agent, excess unbound material is removed by washing.
The sample receiving areas may then be blocked through incubation with bovine serum
albumin (BSA), casein or other innocuous protein or other moiety.

In a preferred embodiment, the lung cancer protein is bound to the support, and a test
compound is added to the assay. Alternatively, the candidate agent is bound to the support
and the lung cancer protein is added. Novel binding agents include specific antibodies,
non-natural binding agents identified in screens of chemical libraries, peptide analogs, etc. Of
particular interest are screening assays for agents that have a low toxicity for human cells. A
wide variety of assays may be used for this purpose, including labeled in vitro protein-protein
binding assays, electrophoretic mobility shift assays, immunoassays for protein binding,
functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the lung cancer
protein may be done in a number of ways. In a preferred embodiment, the compound is
labeled, and binding determined directly, e.g., by attaching all or a portion of the lung cancer
protein to a solid support, adding a labeled candidate agent (e.g., a fluorescent label), washing
off excess reagent, and determining whether the label is present on the solid support. Various
blocking and washing steps may be utilized as appropriate.

In some embodiments, only one of the components is labeled, e.g., the proteins (or
proteinaceous candidate compounds) can be labeled. Alternatively, more than one
component can be labeled with different labels, e.g., 125I for the proteins and a fluorophor for
the compound. Proximity reagents, e.g., quenching or energy transfer reagents are also
useful.

In one embodiment, the binding of the test compound is determined by competitive
binding assay. The competitor may be a binding moiety known to bind to the target molecule
(i.e., a lung cancer protein), such as an antibody, peptide, binding partner, ligand, etc. Under
certain circumstances, there may be competitive binding between the compound and the
binding moiety, with the binding moiety displacing the compound. In one embodiment, the
test compound is labeled. Either the compound, or the competitor, or both, is added first to
the protein for a time sufficient to allow binding, if present. Incubations may be performed at
a temperature which facilitates optimal activity, typically between 4 and 40° C. Incubation
periods are typically optimized, e.g., to facilitate rapid high throughput screening. Typically
between 0.1 and 1 hour will be sufficient. Excess reagent is generally removed or washed
away. The second component is then added, and the presence or absence of the labeled
component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by a test
compound. Displacement of the competitor is an indication that the test compound is binding
to the lung cancer protein and thus is capable of binding to, and potentially modulating, the
activity of the lung cancer protein. In this embodiment, either component can be labeled.
Thus, e.g., if the competitor is labeled, the presence of label in the wash solution indicates
displacement by the agent. Alternatively, if the test compound is labeled, the presence of the
label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with incubation and
washing, followed by the competitor. The absence of binding by the competitor may indicate
that the test compound is bound to the lung cancer protein with a higher affinity. Thus, if the
test compound is labeled, the presence of the label on the support, coupled with a lack of
competitor binding, may indicate that the test compound is capable of binding to the lung
cancer protein.

In a preferred embodiment, the methods comprise differential screening to identify
agents that are capable of modulating the activity of the lung cancer proteins. In one
embodiment, the methods comprise combining a lung cancer protein and a competitor in a
first sample. A second sample comprises a test compound, a lung cancer protein, and a
competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the lung cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the lung cancer protein.
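The two-sample comparison described above can be sketched in Python as a fractional change in bound-competitor signal. The signal values, the function names, and in particular the 20% cutoff are illustrative assumptions only; the specification does not state any threshold.

```python
def competitor_binding_change(first_sample, second_sample):
    """Fractional change in bound-competitor signal caused by the test compound.

    first_sample:  signal with protein + competitor only
    second_sample: signal with protein + competitor + test compound
    """
    return (second_sample - first_sample) / first_sample


def is_candidate_binder(first_sample, second_sample, threshold=0.2):
    """Flag a compound when competitor binding shifts by more than `threshold`.

    The 20% default is an arbitrary illustrative cutoff, not a value from the text.
    """
    return abs(competitor_binding_change(first_sample, second_sample)) > threshold


# Hypothetical labeled-competitor signals for two test compounds.
print(is_candidate_binder(first_sample=1000.0, second_sample=450.0))  # True
print(is_candidate_binder(first_sample=1000.0, second_sample=980.0))  # False
```

Using the absolute change reflects the statement that any difference in competitor binding between the two samples, not only a decrease, indicates a binding agent.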

Alternatively, differential screening is used to identify drug candidates that bind to the
native lung cancer protein, but cannot bind to modified lung cancer proteins. The structure of
the lung cancer protein may be modeled, and used in rational drug design to synthesize agents
that interact with that site. Drug candidates that affect the activity of a lung cancer protein
are also identified by screening drugs for the ability to either enhance or reduce the activity of
the protein.

Positive controls and negative controls may be used in the assays. Preferably control
and test samples are performed in at least triplicate to obtain statistically significant results.
Incubation of all samples is for a time sufficient for the binding of the agent to the protein.
Following incubation, samples are washed free of non-specifically bound material and the
amount of bound, generally labeled agent determined. For example, where a radiolabel is
employed, the samples may be counted in a scintillation counter to determine the amount of
bound compound.
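A minimal Python sketch of how triplicate counts from such an assay might be summarized is given below. The counts are hypothetical, and subtracting the negative-control mean as background is an assumed, commonly used convention rather than a step stated in the specification.

```python
from statistics import mean, stdev

# Hypothetical scintillation counts (counts per minute), three replicates each.
negative_control = [210.0, 190.0, 200.0]
test_sample = [1480.0, 1520.0, 1500.0]


def summarize(replicates):
    """Return (mean, sample standard deviation) for a set of replicates."""
    return mean(replicates), stdev(replicates)


def background_corrected(test, control):
    """Mean test signal minus mean negative-control signal."""
    return mean(test) - mean(control)


ctrl_mean, ctrl_sd = summarize(negative_control)
test_mean, test_sd = summarize(test_sample)
print(f"control: {ctrl_mean:.0f} +/- {ctrl_sd:.0f} cpm")
print(f"test:    {test_mean:.0f} +/- {test_sd:.0f} cpm")
print(f"bound above background: {background_corrected(test_sample, negative_control):.0f} cpm")
```

Running at least three replicates is what makes the standard deviation meaningful here; with a single measurement per sample no estimate of variability is available.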

A variety of other reagents may be included in the screening assays. These include
reagents like salts, neutral proteins, e.g., albumin, detergents, etc. which may be used to
facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of a lung cancer protein. The methods
comprise adding a test compound, as defined above, to a cell comprising lung cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a lung cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence or previous or
subsequent exposure to physiological signals, e.g., hormones, antibodies, peptides, antigens,
cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (e.g., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate lung cancer agents are identified. Compounds
with pharmacological activity are able to enhance or interfere with the activity of the lung
cancer protein. Once identified, similar structures are evaluated to identify critical structural
features of the compound.

In one embodiment, a method of inhibiting lung cancer cell division is provided. The
method comprises administration of a lung cancer inhibitor. In another embodiment, a
method of inhibiting lung cancer is provided. The method may comprise administration of a
lung cancer inhibitor. In a further embodiment, methods of treating cells or individuals with
lung cancer are provided, e.g., comprising administration of a lung cancer inhibitor.

In one embodiment, a lung cancer inhibitor is an antibody as discussed above. In
another embodiment, the lung cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, viability, and metastasis assays are known to
those of skill in the art, as described below.

Soft agar growth or colony formation in suspension

Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of lung cancer sequences, which when expressed in host cells, inhibit abnormal
cellular proliferation and transformation. A therapeutic compound would reduce or eliminate
the host cells' ability to grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney (1994) Culture of Animal Cells: A Manual of Basic Technique (3rd ed.),
herein incorporated by reference. See also, the methods section of Garkavtsev, et al. (1996),
supra, herein incorporated by reference.

Contact inhibition and density limitation of growth

Normal cells typically grow in a flat and organized pattern in a petri dish until they
touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (³H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype and become contact inhibited and would grow to a lower density.

In this assay, labeling index with (³H)-thymidine at saturation density is a preferred
method of measuring density limitation of growth. Transformed host cells are transfected
with a lung cancer-associated sequence and are grown for 24 hours at saturation density in
non-limiting medium conditions. The percentage of cells labeling with (³H)-thymidine is
determined autoradiographically. See, Freshney (1994), supra.
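
The labeling index described above is the percentage of scored cells that carry label at
saturation density. The following is a minimal Python sketch of that arithmetic only. The
function name and the cell counts are hypothetical illustrations and are not data from this
disclosure.

```python
def labeling_index(labeled_cells, total_cells):
    """Return the percentage of scored cells that carry autoradiographic label."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must lie between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


# Hypothetical counts: (labeled nuclei, total nuclei scored) for each culture.
counts = {
    "transformed control": (412, 1000),
    "transformed + test sequence": (57, 1000),
}

indices = {name: labeling_index(*pair) for name, pair in counts.items()}
for name, value in indices.items():
    print(f"{name}: labeling index = {value:.1f}%")
```

A lower labeling index in the transfected culture than in the control would be consistent
with restored density limitation of growth.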

Growth factor or serum dependence 
Transformed cells typically have a lower serum dependence than their normal
counterparts (see, e.g., Temin (1966) J. Natl. Cancer Inst. 37:167-175; Eagle, et al. (1970) J.
Exp. Med. 131:836-879; Freshney, supra). This is in part due to release of various growth
factors by the transformed cells. Growth factor or serum dependence of transformed host
cells can be compared with that of control.

Tumor specific markers levels 

Tumor cells release an increased amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, "Angiogenesis, tumor vascularization, and potential interference with tumor growth"
in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than their normal
counterparts. See, e.g., Folkman (1992) "Angiogenesis and Cancer" in Sem. Cancer Biol.

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkeless, et al. (1974) J. Biol. Chem. 249:4295-4305;
Strickland and Beers (1976) J. Biol. Chem. 251:5694-5702; Whur, et al. (1980) Br. J. Cancer
42:305-312; Gullino, "Angiogenesis, tumor vascularization, and potential interference with
tumor growth" in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184; Freshney
(1985) Anticancer Res. 5:111-130.

Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate lung cancer-associated
sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the level of
invasion of host cells can be measured by using filters coated with Matrigel or some other
extracellular matrix constituent. Penetration into the gel, or through to the distal side of the
filter, is rated as invasiveness, and rated histologically by number of cells and distance
moved, or by prelabeling the cells with ¹²⁵I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.



Tumor growth in vivo

Effects of lung cancer-associated sequences on cell growth can be tested in transgenic
or immune-suppressed mice. Knock-out transgenic mice can be made, in which the lung
cancer gene is disrupted or in which a lung cancer gene is inserted. Knock-out transgenic
mice can be made by insertion of a marker gene or other heterologous gene into the
endogenous lung cancer gene site in the mouse genome via homologous recombination.
Such mice can also be made by substituting the endogenous lung cancer gene with a mutated
version of the lung cancer gene, or by mutating the endogenous lung cancer gene, e.g., by
exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi, et al. (1989) Science 244:1288). Chimeric targeted mice can be
derived according to Hogan, et al. (1988) Manipulating the Mouse Embryo: A Laboratory
Manual, Cold Spring Harbor Laboratory and Robertson (ed. 1987) Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, IRL Press, Washington, D.C.

Alternatively, various immune-suppressed or immune-deficient host animals can be
used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella, et al. (1974) J.
Natl. Cancer Inst. 52:921), a SCID mouse, a thymectomized mouse, or an irradiated mouse
(see, e.g., Bradley, et al. (1978) Br. J. Cancer 38:263; Selby, et al. (1980) Br. J. Cancer 41:52)
can be used as a host. Transplantable tumor cells (typically about 10⁶ cells) injected into
isogenic hosts will produce invasive tumors in a high proportion of cases, while normal cells
of similar origin will not. In hosts which developed invasive tumors, cells expressing a lung
cancer-associated sequence are injected subcutaneously. After a suitable length of time,
preferably 4-8 weeks, tumor growth is measured (e.g., by volume or by its two largest
dimensions) and compared to the control. Tumors that have statistically significant reduction
(using, e.g., Student's T test) are said to have inhibited growth.
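
The comparison of tumor growth against control can be sketched in Python as follows. This
is an illustration under stated assumptions and not a prescribed protocol: the volume
formula V = L x W^2 / 2 is a commonly used caliper approximation that this disclosure does
not specify, the measurements are hypothetical, and the fixed critical value applies only
to 8 degrees of freedom (five animals per group).

```python
from math import sqrt
from statistics import mean, variance


def tumor_volume(length_mm, width_mm):
    """Approximate volume (mm^3) from the two largest dimensions: L * W^2 / 2 (assumed)."""
    return length_mm * width_mm ** 2 / 2.0


def student_t(control, treated):
    """Pooled-variance two-sample Student's t statistic and its degrees of freedom."""
    n1, n2 = len(control), len(treated)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / df
    t = (mean(control) - mean(treated)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical caliper measurements (length, width) in mm after 6 weeks.
control_dims = [(12, 10), (13, 11), (11, 10), (14, 11), (12, 11)]
treated_dims = [(8, 6), (9, 7), (7, 6), (8, 7), (9, 6)]

control = [tumor_volume(l, w) for l, w in control_dims]
treated = [tumor_volume(l, w) for l, w in treated_dims]

t, df = student_t(control, treated)
T_CRIT_DF8 = 2.306  # two-sided 5% critical value of Student's t for 8 degrees of freedom
inhibited = df == 8 and t > T_CRIT_DF8
print(f"t = {t:.2f}, df = {df}, growth inhibited = {inhibited}")
```

For other group sizes the critical value would need to be looked up for the matching
degrees of freedom, or a statistics package used to obtain a p-value directly.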



Polynucleotide modulators of lung cancer
Antisense and RNAi Polynucleotides 

In certain embodiments, the activity of a lung cancer-associated protein is
downregulated, or entirely inhibited, by the use of antisense or an inhibitory polynucleotide,
i.e., a nucleic acid complementary to, and which can preferably hybridize specifically to, a
coding mRNA nucleic acid sequence, e.g., a lung cancer protein mRNA, or a subsequence
thereof. Binding of the antisense polynucleotide to the mRNA reduces the translation and/or
stability of the mRNA.

In the context of this invention, antisense polynucleotides can comprise naturally-occurring
nucleotides, or synthetic species formed from naturally-occurring subunits or their
close homologs. Antisense polynucleotides may also have altered sugar moieties or inter-sugar
linkages. Exemplary among these are the phosphorothioate and other sulfur containing
species which are known for use in the art. Analogs are comprehended by this invention so
long as they function effectively to hybridize with the lung cancer protein mRNA. See, e.g.,
Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant means,
or can be synthesized in vitro. Equipment for such synthesis is sold by several vendors,
including Applied Biosystems. The preparation of other oligonucleotides such as
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.

Antisense molecules as used herein include antisense or sense oligonucleotides.
Sense oligonucleotides can, e.g., be employed to block transcription by binding to the anti-sense
strand. The antisense and sense oligonucleotide comprise a single-stranded nucleic
acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA
(antisense) sequences for lung cancer molecules. A preferred antisense molecule is for a lung
cancer sequence in the tables, or for a ligand or activator thereof. Antisense or sense
oligonucleotides, according to the present invention, comprise a fragment generally at least
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein
is described in, e.g., Stein and Cohen (1988) Cancer Res. 48:2659 and van der Krol, et al.
(1988) BioTechniques 6:958.
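
Deriving candidate antisense oligonucleotides from a cDNA sequence can be sketched as
taking the reverse complement of each window of the sense strand within the 14 to 30
nucleotide range given above. The Python below is a minimal sketch only: the function
names and the short cDNA string are hypothetical, and no selection for specificity,
secondary structure, or melting temperature is attempted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_candidates(cdna, length=20):
    """Yield (start, oligo) pairs; each oligo is antisense to cdna[start:start + length]."""
    if not 14 <= length <= 30:
        raise ValueError("length should be from about 14 to 30 nucleotides")
    for start in range(len(cdna) - length + 1):
        yield start, reverse_complement(cdna[start:start + length])


# Hypothetical sense-strand cDNA fragment (not a sequence from the tables).
cdna = "ATGGCTAGCTTGACCGGTACGATCGATTAG"
oligos = list(antisense_candidates(cdna, length=14))
for start, oligo in oligos[:3]:
    print(start, oligo)
```

Each oligo, read 5' to 3', is complementary to the corresponding stretch of the mRNA, so
taking its reverse complement recovers the original sense window.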

RNA interference is a mechanism to suppress gene expression in a sequence specific
manner. See, e.g., Brummelkamp, et al. (2002) Sciencexpress (21 March 2002); Sharp (1999)
Genes Dev. 13:139-141; and Carthew (2001) Curr. Op. Cell Biol. 13:244-248. In mammalian
cells, short, e.g., 21 nt, double stranded small interfering RNAs (siRNA) have been shown to
be effective at inducing an RNAi response. See, e.g., Elbashir, et al. (2001) Nature 411:494-498.
The mechanism may be used to downregulate expression levels of identified genes, e.g.,
for treatment of or validation of relevance to disease.



Ribozymes 

In addition to antisense polynucleotides, ribozymes can be used to target and inhibit
transcription of lung cancer-associated nucleotide sequences. A ribozyme is an RNA
molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have
been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes,
RNase P, and axhead ribozymes (see, e.g., Castanotto, et al. (1994) Adv. in Pharmacology
25:289-317 for a general review of the properties of different ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel, et al. (1990)
Nucl. Acids Res. 18:299-304; European Patent Publication No. 0 360 257; U.S. Patent No.
5,254,678. Methods of preparing are well known to those of skill in the art (see, e.g., WO
94/26877; Ojwang, et al. (1993) Proc. Natl. Acad. Sci. USA 90:6340-6344; Yamada, et al.
(1994) Human Gene Therapy 1:39-45; Leavitt, et al. (1995) Proc. Natl. Acad. Sci. USA
92:699-703; Leavitt, et al. (1994) Human Gene Therapy 5:1151-120; and Yamada, et al.
(1994) Virology 205:121-126).

Polynucleotide modulators of lung cancer may be introduced into a cell containing the
target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as
described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to,
cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell
surface receptors. Preferably, conjugation of the ligand binding molecule does not
substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of lung
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that the use of antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

Thus, in one embodiment, methods of modulating lung cancer in cells or organisms
are provided. In one embodiment, the methods comprise administering to a cell an anti-lung
cancer antibody that reduces or eliminates the biological activity of an endogenous lung
cancer protein. Alternatively, the methods comprise administering to a cell or organism a
recombinant nucleic acid encoding a lung cancer protein. This may be accomplished in any
number of ways. In a preferred embodiment, e.g., when the lung cancer sequence is down-regulated
in lung cancer, such state may be reversed by increasing the amount of lung cancer
gene product in the cell. This can be accomplished, e.g., by overexpressing the endogenous
lung cancer gene or administering a gene encoding the lung cancer sequence, using known
gene-therapy techniques. In a preferred embodiment, the gene therapy techniques include the
incorporation of the exogenous gene using enhanced homologous recombination (EHR), e.g.,
as described in PCT/US93/03868, hereby incorporated by reference in its entirety.
Alternatively, e.g., when the lung cancer sequence is up-regulated in lung cancer, the activity
of the endogenous lung cancer gene is decreased, e.g., by the administration of a lung cancer
antisense or RNAi nucleic acid.

In one embodiment, the lung cancer proteins of the present invention may be used to
generate polyclonal and monoclonal antibodies to lung cancer proteins. Similarly, the lung
cancer proteins can be coupled, using standard technology, to affinity chromatography
columns. These columns may then be used to purify lung cancer antibodies useful for
production, diagnostic, or therapeutic purposes. In a preferred embodiment, the antibodies
are generated to epitopes unique to a lung cancer protein; that is, the antibodies show little or
no cross-reactivity to other proteins. The lung cancer antibodies may be coupled to standard
affinity chromatography columns and used to purify lung cancer proteins. The antibodies
may also be used as blocking polypeptides, as outlined above, since they will specifically
bind to the lung cancer protein.



Methods of identifying variant lung cancer-associated sequences

Without being bound by theory, expression of various lung cancer sequences is
correlated with lung cancer. Accordingly, disorders based on mutant or variant lung cancer
genes may be determined. In one embodiment, the invention provides methods for
identifying cells containing variant lung cancer genes, e.g., determining all or part of the
sequence of at least one endogenous lung cancer gene in a cell. In a preferred embodiment,
the invention provides methods of identifying the lung cancer genotype of an individual, e.g.,
determining all or part of the sequence of at least one lung cancer gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced lung cancer gene to a known lung cancer gene, i.e.,
a wild-type gene.

The sequence of all or part of the lung cancer gene can then be compared to the
sequence of a known lung cancer gene to determine if any differences exist. This can be
done using known homology programs, such as Bestfit, etc. In a preferred embodiment, the
presence of a difference in the sequence between the lung cancer gene of the patient and the
known lung cancer gene correlates with a disease state or a propensity for a disease state, as
outlined herein.
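
The comparison of a sequenced gene against a known sequence can be illustrated with a
short Python sketch. It uses the standard-library difflib.SequenceMatcher as a simple
stand-in and is not an implementation of Bestfit or of any homology program. The function
name and both sequences are hypothetical.

```python
from difflib import SequenceMatcher


def sequence_differences(reference, sample):
    """List (tag, ref_start, ref_end, ref_bases, sample_bases) for each non-matching span."""
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    return [
        (tag, i1, i2, reference[i1:i2], sample[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical wild-type and patient-derived fragments differing by one substitution.
reference = "ATGGCCATTGTAATGGGCCGC"
sample = "ATGGCCATTGTTATGGGCCGC"

differences = sequence_differences(reference, sample)
for tag, start, end, ref_bases, sample_bases in differences:
    print(f"{tag} at reference {start}-{end}: {ref_bases!r} -> {sample_bases!r}")
```

An empty result would indicate that no difference from the known sequence was found over
the compared region. Real comparisons would normally use a dedicated alignment tool with
an explicit scoring scheme.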

In a preferred embodiment, the lung cancer genes are used as probes to determine the
number of copies of the lung cancer gene in the genome.

In another preferred embodiment, the lung cancer genes are used as probes to
determine the chromosomal localization of the lung cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis in particular when
chromosomal abnormalities such as translocations, and the like are identified in the lung
cancer gene locus.

Administration of pharmaceutical and vaccine compositions 

In one embodiment, a therapeutically effective dose of a lung cancer protein or
modulator thereof, is administered to a patient. By "therapeutically effective dose" herein is
meant a dose that produces effects for which it is administered. The exact dose will depend
on the purpose of the treatment, and will be ascertainable by one skilled in the art using
known techniques (e.g., Ansel, et al. (1992) Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3), Dekker, ISBN 0824770846,
082476918X, 0824712692, 0824716981; Lloyd (1999) The Art, Science and Technology of
Pharmaceutical Compounding; and Pickar (1999) Dosage Calculations). Adjustments for
lung cancer degradation, systemic versus localized delivery, and rate of new protease
synthesis, as well as the age, body weight, general health, sex, diet, time of administration,
drug interaction and the severity of the condition may be necessary, and will be ascertainable
with routine experimentation by those skilled in the art.

A "patient" for the purposes of the present invention includes both humans and other
animals, particularly mammals. Thus the methods are applicable to both human therapy and
veterinary applications. In the preferred embodiment the patient is a mammal, preferably a
primate, and in the most preferred embodiment the patient is human.

The administration of the lung cancer proteins and modulators thereof of the present
invention can be done in a variety of ways, including, but not limited to, orally,
subcutaneously, intravenously, intranasally, transdermally, intraperitoneally, intramuscularly,
intrapulmonarily, vaginally, rectally, or intraocularly. In some instances, e.g., in the treatment
of wounds and inflammation, the lung cancer proteins and modulators may be directly
applied as a solution or spray.

The pharmaceutical compositions of the present invention comprise a lung cancer
protein in a form suitable for administration to a patient. In the preferred embodiment, the
pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the
biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the following:
carrier proteins such as serum albumin; buffers; fillers such as microcrystalline cellulose,
lactose, corn and other starches; binding agents; sweeteners and other flavoring agents;
coloring agents; and polyethylene glycol.

The pharmaceutical compositions can be administered in a variety of unit dosage
forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that lung cancer protein modulators (e.g., antibodies, antisense
constructs, ribozymes, small organic molecules, etc.) when administered orally, should be
protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise a lung cancer protein
modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous carrier.
A variety of aqueous carriers can be used, e.g., buffered saline and the like. These solutions
are sterile and generally free of undesirable matter. These compositions may be sterilized by
conventional, well known sterilization techniques. The compositions may contain
pharmaceutically acceptable auxiliary substances as required to approximate physiological
conditions such as pH adjusting and buffering agents, toxicity adjusting agents and the like,
e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride, sodium lactate
and the like. The concentration of active agent in these formulations can vary widely, and
will be selected primarily based on fluid volumes, viscosities, body weight and the like in
accordance with the particular mode of administration selected and the patient's needs (e.g.,
Remington's Pharmaceutical Science (15th ed., 1980) and Hardman, et al. (eds. 1996)
Goodman and Gilman: The Pharmacological Basis of Therapeutics).

Thus, a typical pharmaceutical composition for intravenous administration would be
about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per
day may be used, particularly when the drug is administered to a secluded site and not into
the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher
dosages are possible in topical administration. Actual methods for preparing parenterally
administrable compositions will be known or apparent to those skilled in the art, e.g.,
Remington's Pharmaceutical Science and Goodman and Gilman, The Pharmacological Basis
of Therapeutics, supra.

The compositions containing modulators of lung cancer proteins can be administered
for therapeutic or prophylactic treatments. In therapeutic applications, compositions are
administered to a patient suffering from a disease (e.g., a cancer) in an amount sufficient to
cure or at least partially arrest the disease and its complications. An amount adequate to
accomplish this is defined as a "therapeutically effective dose." Amounts effective for this
use will depend upon the severity of the disease and the general state of the patient's health.
Single or multiple administrations of the compositions may be administered depending on the
dosage and frequency as required and tolerated by the patient. In any event, the composition
should provide a sufficient quantity of the agents of this invention to effectively treat the
patient. An amount of modulator that is capable of preventing or slowing the development of
cancer in a mammal is referred to as a "prophylactically effective dose." The particular dose
required for a prophylactic treatment will depend upon the medical condition and history of
the mammal, the particular cancer being prevented, as well as other factors such as age,
weight, gender, administration route, efficiency, etc. Such prophylactic treatments may be
used, e.g., in a mammal who has previously had cancer to prevent a recurrence of the cancer,
or in a mammal who is suspected of having a significant likelihood of developing cancer
based, at least in part, upon gene expression profiles. Vaccine strategies may be used, in
either a DNA vaccine form, or protein vaccine.

It will be appreciated that the present lung cancer protein-modulating compounds can
be administered alone or in combination with additional lung cancer modulating compounds
or with other therapeutic agents, e.g., other anti-cancer agents or treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in the tables, such as antisense or RNAi
polynucleotides or ribozymes, will be introduced into cells, in vitro or in vivo. The present
invention provides methods, reagents, vectors, and cells useful for expression of lung cancer-associated
polypeptides and nucleic acids using in vitro (cell-free), ex vivo, or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and other well known methods for introducing cloned genomic
DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g.,
Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger), Ausubel, et al. (eds. 1999) Current Protocols (supplemented through
1999), and Sambrook, et al. (1989) Molecular Cloning - A Laboratory Manual (2nd ed., Vol.
1-3)).

In a preferred embodiment, lung cancer proteins and modulators are administered as
therapeutic agents, and can be formulated as outlined above. Similarly, lung cancer genes
(including both the full-length sequence, partial sequences, or regulatory sequences of the
lung cancer coding regions) can be administered in a gene therapy application. These lung
cancer genes can include antisense or inhibitory applications, e.g., as inhibitory RNA or gene
therapy (e.g., for incorporation into the genome) or as antisense compositions.

Lung cancer polypeptides and polynucleotides can also be administered as vaccine
compositions to stimulate HTL, CTL, and antibody responses. Such vaccine compositions
can include, e.g., lipidated peptides (see, e.g., Vitiello, et al. (1995) J. Clin. Invest. 95:341),
peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG") microspheres
(see, e.g., Eldridge, et al. (1991) Molec. Immunol. 28:287-294; Alonso, et al. (1994) Vaccine
12:299-306; Jones, et al. (1995) Vaccine 13:675-681), peptide compositions contained in
immune stimulating complexes (ISCOMS) (see, e.g., Takahashi, et al. (1990) Nature
344:873-875; Hu, et al. (1998) Clin. Exp. Immunol. 113:235-243), multiple antigen peptide
systems (MAPs) (see, e.g., Tam (1988) Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413; Tam
(1996) J. Immunol. Methods 196:17-32), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus, et al., p. 379 In: Kaufmann (ed. 1996) Concepts in vaccine development;
Chakrabarti, et al. (1986) Nature 320:535; Hu, et al. (1986) Nature 320:537; Kieny, et al.
(1986) AIDS Bio/Technology 4:790; Top, et al. (1971) J. Infect. Dis. 124:148; Chanda, et al.
(1990) Virology 175:535), particles of viral or synthetic origin (see, e.g., Kofler, et al. (1996)
J. Immunol. Methods 192:25; Eldridge, et al. (1993) Sem. Hematol. 30:16; Falo, et al. (1995)
Nature Med. 7:649), adjuvants (Warren, et al. (1986) Annu. Rev. Immunol. 4:369; Gupta, et
al. (1993) Vaccine 11:293), liposomes (Reddy, et al. (1992) J. Immunol. 148:1585; Rock
(1996) Immunol. Today 17:131), or, naked or particle absorbed cDNA (Ulmer, et al. (1993)
Science 259:1745; Robinson, et al. (1993) Vaccine 11:957; Shiver, et al., p. 423 In:
Kaufmann (ed. 1996) Concepts in vaccine development; Cease and Berzofsky (1994) Annu.
Rev. Immunol. 12:923 and Eldridge, et al. (1993) Sem. Hematol. 30:16). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a substance
designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or
mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or RNA
encoding one or more of the polypeptides, or a fragment thereof, is administered to a patient.
This approach is described, for instance, in Wolff, et al. (1990) Science 247:1465 as well as
U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647; WO
98/04720; and in more detail below. Examples of DNA-based delivery technologies include
"naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery, cationic lipid
complexes, and particle-mediated ("gene gun") or pressure-mediated delivery (see, e.g., U.S.
Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the invention
can be expressed by viral or bacterial vectors. Examples of expression vectors include
attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode lung cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover, et al. (1991) Nature 351:456-460. A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus
vectors, retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the
like, will be apparent to those skilled in the art from the description herein (see, e.g., Shata, et
al. (2000) Mol. Med. Today 6:66-71; Shedlock, et al. (2000) J. Leukoc. Biol. 68:793-806;
Hipp, et al. (2000) In Vivo 14:571-85).

Methods for the use of genes as DNA vaccines are well known, and include placing a 
lung cancer gene or portion of a lung cancer gene under the control of a regulatable promoter 
or a tissue-specific promoter for expression in a lung cancer patient. The lung cancer gene 
used for DNA vaccines can encode full-length lung cancer proteins, but more preferably 
encodes portions of the lung cancer proteins including peptides derived from the lung cancer 
protein. In one embodiment, a patient is immunized with a DNA vaccine comprising a 
plurality of nucleotide sequences derived from a lung cancer gene. For example, lung 
cancer-associated genes or sequence encoding subfragments of a lung cancer protein are introduced 
into expression vectors and tested for their immunogenicity in the context of Class I MHC 
and an ability to generate cytotoxic T cell responses. This procedure provides for production 
of cytotoxic T cell responses against cells which present antigen, including intracellular 
epitopes. 

In a preferred embodiment, DNA vaccines include a gene encoding an adjuvant 
molecule with the DNA vaccine. Such adjuvant molecules include cytokines that increase 
the immunogenic response to the lung cancer polypeptide encoded by the DNA vaccine. 
Additional or alternative adjuvants are available. 

In another preferred embodiment, lung cancer genes find use in generating animal 
models of lung cancer. When the lung cancer gene identified is repressed or diminished in 
metastatic tissue, gene therapy technology, e.g., wherein antisense or inhibitory RNA is directed 
to the lung cancer gene, will also diminish or repress expression of the gene. Animal models 
of lung cancer find use in screening for modulators of a lung cancer-associated sequence or 
modulators of lung cancer. Similarly, transgenic animal technology including gene knockout 
technology, e.g., as a result of homologous recombination with an appropriate gene targeting 
vector, will result in the absence or increased expression of the lung cancer protein. When 
desired, tissue-specific expression or knockout of the lung cancer protein may be necessary. 
It is also possible that the lung cancer protein is overexpressed in lung cancer. As 
such, transgenic animals can be generated that overexpress the lung cancer protein. 
Depending on the desired expression level, promoters of various strengths can be employed 
to express the transgene. Also, the number of copies of the integrated transgene can be 
determined and compared for a determination of the expression level of the transgene. 
Animals generated by such methods will find use as animal models of lung cancer and are 
additionally useful in screening for modulators to treat lung cancer. 



Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above, kits are 
also provided by the invention. In diagnostic and research applications such kits may include 
at least one of the following: assay reagents, buffers, lung cancer-specific nucleic acids or 
antibodies, hybridization probes and/or primers, antisense polynucleotides, ribozymes, RNAi, 
dominant negative lung cancer polypeptides or polynucleotides, small molecule inhibitors of 
lung cancer-associated sequences, etc. A therapeutic product may include sterile saline or 
another pharmaceutically acceptable emulsion and suspension base. 

In addition, the kits may include instructional materials containing instructions (e.g., 
protocols) for the practice of the methods of this invention. While the instructional materials 
typically comprise written or printed materials, they are not limited to such. A medium 
capable of storing such instructions and communicating them to an end user is contemplated 
by this invention. Such media include, but are not limited to, electronic storage media (e.g., 
magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such 
media may include addresses to internet sites that provide such instructional materials. 

The present invention also provides for kits for screening for modulators of lung 
cancer-associated sequences. Such kits can be prepared from readily available materials and 
reagents. For example, such kits can comprise one or more of the following materials: a lung 
cancer-associated polypeptide or polynucleotide, reaction tubes, and instructions for testing 
lung cancer-associated activity. Optionally, the kit contains biologically active lung cancer 
protein. A wide variety of kits and components can be prepared according to the present 
invention, depending upon the intended user of the kit and the particular needs of the user. 
Diagnosis would typically involve evaluation of a plurality of genes or products. The genes 
typically will be selected based on correlations with important parameters in disease which 
may be identified in historical or outcome data. 
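
By way of illustration only, selection of genes based on correlation with a disease parameter might be sketched as follows. The gene names, expression values, outcome values, and the choice of Pearson correlation as the ranking measure are all hypothetical assumptions made for this sketch; the specification does not prescribe a particular statistic.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_genes(expression, outcome, top_n):
    """Rank genes by absolute correlation with the outcome and keep the top_n."""
    scored = [(gene, pearson(values, outcome)) for gene, values in expression.items()]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top_n]


# Hypothetical per-sample expression levels and a hypothetical outcome parameter.
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 2.1, 1.9, 2.0],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
}
outcome = [10.0, 20.0, 30.0, 40.0]
selected = select_genes(expression, outcome, top_n=2)
```

Ranking on the absolute value keeps both positively and negatively correlated genes, so that markers that fall as the parameter rises are retained alongside those that rise with it.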
Example 1: Gene Chip Analysis 

Molecular profiles of various normal and cancerous tissues were determined and 
analyzed using gene chips. RNA was isolated and gene chip analysis was performed as 
described (Glynne, et al. (2000) Nature 403:672-676; Zhao, et al. (2000) Genes Dev. 14:981-993). 
Tables 1A and 1B were previously filed on April 18, 2001 in USSN 60/284,770 (18501-001500US) and on November 29, 2001 in USSN 60/334,370 
(18501-001520US) 
0L97 1.55
109412 AA227145 Hs^73 ESTs: Weakly similar to REGULATOR OF MIT 076 1.87
110780 N23174 Hs.22891 solute canler family 7 (caSonto andiw 09 OK
110958 N50550 Hs^4587 s^nal transduction prot^ (SH3 cont^ 1.17 226
111018 N54067 Hs^ milogen-acfivated prot^ litoase Mnase 1.21 1.65
111337 N79612 Hs.16607 ESTs; Highly similartoMyosin heavy cha 1 1.45
112305 R54822 Hs^6244 ESTs 1 1
112401 R61279 H5^37536 ESTs; Weakly simiar to F25B5 J [Celega 1.24 1.64
112853 T02843 Hs.4351 EST 1.56 1.96
112869 T03313 Hs.4747 dysker^osis cor^enlla 1; dysioeitn 1.03 1.57
112992 T23513 Hs.7147 ESTs 1 1
113048 T25B95 Hs.184008 ESTs; Weakly similar to RNA-blnding prat 1.37 226
113083 T32438 H&SQ27 ESTs 1 1
113179 T551B2 Hs.152571 ESTs; Highly similar to IGF-ll mRNAtind 1.33 27
113573 T91166 H5.15990 ESTs 0.76 1.47
11^11 W4492B Hs.4878 ESTs 0.79 1.51
114086 Z38266 Hs.12770 Homo sapiens PAC dons 0lI0777O23 fiom 7p 09 1.34
114587 AA070827 HS.1B0320 ESTs; Wk^ simiar to GOLGt 4-TRANSMEM 1.02 1.76
114846 AA234929 Hs.44343 ESTs 1.32 236
114964 AA243873 Hs.82184 ring finger protein 3 1.1 1.84
1ia)47 AA252827 Hs.22554 homeo box. 65 1.01 23S
115166 AA258409 Hs.ig8907 niyelin protein zenvlkel 1.05 231
115167 .AA25B421 HS.4372B hypcdheScat prot^ 1i2 252
115239 AA278650 Hs.73291 ESTs; W^aUy similar to simSar to the b 07 257
115278 AA279757 Hs.67466 ESTs; Weakly simQar to BACN32G11 J pjn 1.14 212
115552 AM05098 Hs,38178 ESTs 082 4.67
11OT75 AA433943 Hs.43946 ESTs; Wealdy similar to Wealc sirnSai^ 1.2 1.98
116004 AA449122 Hs.76088 ESTs; Kghly similar to small zinc finge 096 1.31
116121 AM59254 Hs.48855 ESTs 097 1.S
116129 AA459956 Hs.49163 ESTs; Highly similar to pulaBve iflxmuc 1.08 273
116190 AA464963 Hs.67776 ESTs 08 1.57
116312 AM90494 Hs.65403 ESTs 1.37 265
116732 F13779 Hs.1 65909 ESTs 0.92 1.8
117602 N35O20 Hs.44585 ESTs; WeaJdy simBar to GOUATH PROTEIN 1.15 1.84
117950 N51394 HSJ5478 KlAA09%protdn 1.04 238
117992 N52000 Hs.172089 HomD sapiens mRNA; cONA DKFZp586BQ222 (r 062 1.29
118785 N75386 Hs.1 11867 GLM(ruppei famOy member QJ2 1 1
119717 W59134 Hs.57987 ESTs 1 1.4
119814 W74069 H8.56350 ESTs 078 1.77
120128 Z3B499 KSJ1448 MKP-1 IBe protdn tyrosine phosphalase 086 1.46
120242 298443 H5»86386 ESTs 083 201



120483 AA2S2994 Hs.1578 apOpiDSB BiniDUOf 4 tSIBVtWy
121054 AA39S604 Hs57387 ESTs
121326 AA404246 fe9703l ESTs; W^aUy simBar to SMar b phylD
121376 AM05699 Hs.1^232 ESTs; Moderately similar to SODIUM- AND
121457 AMI 1446 ESTs
121760 AA422086 H$.1 24660 ESTs
121781 AM221S0 Hs.98370
121844 AA425732 III. f\QJioe gapluncfion prot^ beta% 26kD (conn
122059 AA431737 KS.9S749 EST
122338 AM43311 hS.SS99o ESTs
122354 AA443772 Hs.166892 ESTs
122591 AA453265 Hs.99311 EST8;VroaXvsiini]artoMnJ [H.sapi6ns]
122790 AA450156 Hs.99556 ESTs
123396 AA521265 H5.1(^14 ESTs
123518 AA608531 Hs.170313 ESTs
123673 AA609471 Hs.112712 ESTs
124000 D57317 Hs.74861 acSvaldd RNA pdymBrase tl transcripllo
124367 N24006 Hs.99348 cfistaMass homeoboxS
124447 N48000 H5.140945 Homo sspta mRNA; cOKA DKFZp586ll41 (fr
125756 W25498 Hs.81634 ATP synthase; Ho transpof&ig; mOochond
125769 Ai382g72 Hs.82128 5T4 oncofetal trophotdsst glycopidein
125852 H0B290 Hs.76550 Homo sapiens mRNA; cONA OKFZp56461264 (r
125924 AAS26849 Hs^2109 Qfndecan 1
12G037 MB5772 Hs.6066 KIAAI 112 protein
126214 N^455 Hs.74316 desmoplaldn (DPI; DPIQ
126414 N78770 Hs.223439 ESTs
126737 AA488132 Hs.62741 ESTs
126743 AA179253 Hs.172182 pdyt^-biRdlng pn^n; cytoplasmic 1
128928 AA179546 Hs.632 ESTs; Highly sinfflar to WTBSRIN BETA^
127432 AA501734 Hs.170311 hsterogeneous nuclear ribonudeoprotein
126218 H02682 Hs.991^ EST^ Moderately amlar to reoomtiina&}
128527 M31523 Hs.101047 IranscnpBon factor 3 (E2A immunoglobul
128568 X60873 Hs.247568 adenytele khase 3
128584 Ml 1433 Hs.101850 letind-Unding proton 1; cellular
128628 C14037 Hs.251978 EST
128891 W27939 HS.103S34 ESTs
128714 VOOS^ Hs.179661 Homo sapiens done 24703 beta^ulwlln mR
126733 AA328993 H&l 04558 ESTs
128781 X85372 Hs.l 05465 SRi^ nudearrtionucleoprotan polypepl
129052 AA496297 Hs.182740 ribosomai prot^ S11
129095 L12350 HS.108S23 Qvombospondin 2
12S241 AA43^65 Hs.l 09706 ESTs; MooeraraysniuarD HNl [Alinuscu
129665 M88458 Hs.1 18778 KDEt (LysAsp-Gbhleu) endopiasmie relic
129703 AA401348 Hs.1 79999 ESTs
129720 AA476582 H5.12152 ESTs; Moderately SBiuarto 90iAL RECOG
129B50 N^3 Hs.56845 GDP dissoctalion lnhibHor2
129896 AAD43021 Hs.l 3225 llDP-<33t:t)6taGicNAc beta 1;4- gatactosytt
130069 AA055696 Hs.1 46428 collagen; type V; alpha 1
130405 H88359 HS.15539S nuclear factor (e/ythrold-dedved 2)^
130541 X05608 Hs211584 neuromamein; Eght poiypepode {6Bkp)
130599 M91670 Hs.174070 id)iquiiin carrier pnriein
130867 J04093 Hs.2056 UI^ gtycosyliransferase 1
131009 AAa63596 Ks.22142 ESTs; Weakly simSar to NADH-CYTOCHROA^
131028 U20240 Hs.2227 CCAAT/enhancer tending proton (C/EBP);
131083 U66661 Hs.22785 gamma-aminobutyTtc add (GABA) A rece^
131091 T35341 Hs.22880 ESTs; Highly similar to olpepfidyl pepQ
131144 C14412 Hs.23528 ESTs; H^hly Mlar to HSPC038 protein
131148 G0003B Hs.23579 ESTs
131164 Y00503 Hs.182265 l(eraBn19
131185 M2S753 Hs^60 cydnBI
131219 CX)0476 Hs24395 smaS inducibie cytokine subfamiy B (Cy
131454 AA455896 Hs.2699 glypican 1
131687 LI 1066 Hs.3069 heat shock 70kD protein 98 (moitsOn>2}
131689 AA599653 H5^696 transcdpfion factor-'flkB 5 (basic hcfix
131692 D50914 Hs.30736 KlAA0124pralein
131786 AA135554 Hs^212 ESTs
131843 AA195693 H8.1840S2 ESTs; Moderately similar to putaSve Rab
1318G0 U02082 Hs.334 Oncogens TIM
131884 H90124 Hs^63 titjosomal protein S23
131903 AA481723 Hs3436 deleted in oral cancer (mouse; homolog)
131945 MB7339 Hs.35120 repBcaSon factor C (acSvator 1) 4 (37
131958 AA093998 Hs.3566 ESTs; Highly Mlar to phosphoryiaBon
131954 W42508 Hs.3583 ESTs
132001 J00277 Hs^003 v41aH^ Harvsy rat sarcoma ^rird (Miooge
132040 AA146843 Hs.1 72894 BK3 Intoracnng donran oeain agonist
132065 OS222B Hs^ll594 protssorre (prosome; macropam) sutxi
132109 AA593B01 Hs.40098 ESTs
Hs.40154 jumonji (mouse) honuioQ
132123 AA447123 Hs^05 ESTs
132162 K89551 Ks.41241 ESTs
132180 AA4055^ Hs.418 fSvotilast acGvalian protein; alfdia; se
AA460917 Hs.2780 junOpfoto-OROogene
132371 AA235448 Hs.46677 ESTs
132618 AA253330 Hs^ adaptor-retatsd protdn complex 1; Ociiyiia
132738 U68019 H&21157B MAD (molhers against decapentaplegic; Dr


0.74 1.64
1JI5 1.93
0l98 1.3
0l91 1^
(LSI 1.9
0.46 0.90
1.07 1.54
0.94 1.4
1.93 Zoo
1 1
0.88 1.49
2.28 2Si
(X88 1.3
1 1.93
1 1
1 1.15
0.74 1.12
0:67 1.1
1.19 1.7
0.93 1.59
1.65 6.76
0172 Z26
\2i 2.25
1.36 1.63
1.93 3.55
1.21 1.66
1 1
1.3 Z18
2.53 28
1.57 Z12
1.24 2.09.
1.08 1.78
1^ 3l48
0.87 Z42
1.22 1.9
1.1 1.73
0.92 1.17
1.34 1.94
0.9 1^
2.59 3b19
1.04 3.2
0.95 U1
1.28 Z63
0.97 1.63
1.09 1.79
0.74 1.68
1.43 4.19
1.17 1.98
1J26 1.79
1 1
1.07 1.66
1 4.6
0.93 1.05
1 1.23
1.1 1.8
1.28 1.98
1.43 2.08
0.88 3L38
1.19 Z77
0.86 3.84
0.65 198
0L99 1.54
1 1.18
1 1.95
1.55 2.39
1 1.33
0.83 1.63
1.08 Z2
1.23 1.24
0.91 1.18
1 Z8
0.67 1.36
1 1.25
1.12 1.43
1 1.55
0.89 1.27
1 1.05
HOD
\S& Z4S
1.08 Z46
1.02 456
1.16 1.8
as 1.26
Ol5 1.49
1.21 1.81



AA488432 


nsJabWf 




4 
1 


1 1 




U785IS 






AM 


1 i% 




T23841 


nS.DUDO 


rUAAi 1 u prolan 


1.10 


1 in 

I'M 


132359 


AA028i03 


Hs£1472 


ESTs; WeaWy airiter to urJcnown [Sxerev 


4 IM 


1J30 




AAS05133 


Hs.7594 


- -i -J- . - • in.r»aii O fT- !niLljL-l ^1 

soutfi carrier lanay 2 {lacuiatea giu 


UL/c 


£.97 


133005 


C21400 


Um 4A9Ma 

HS.10339 


VIA AMnft iiinlutn 

KIAA097Q protetn 


Aoa 


1 U 




Afi25x> 


U* 4T0fiQn 


JlniMiliJn* Hpmf L«««^f-«S« j*lntw« fenced 

didcyigiyoaQi nnsse, aipns (ouxuj 


AOS 


1 77 




N70633 


nS.o4oo 


chaperoflin oonf^nfng TCP1; 8itf)un9[2 0^ 


1 iA 


1 7IR 

i.ro 


133086 


LI 71 31 


HS.139SUU 


h{0h-inoUEIy gnx^ (nontdstons chromoso 


0.97 


4 At 

1.43 




789703 




DMA InfufinM «««M4tf nmlaHi Q 

UNA tSUaatQ tOOW fHOlBSi 0 


4 4 

I.I 


1 B 
1.0 


133195 


AA35U744 


nS.1oi409 


K1AA1007 prolan 


990 


9 AO 
£.09 






Hs.70704 


CCTe 
CbIS 


1 (17 
i.or 


1 RR 
1.00 






Lie 4CfiC7C 


riboscmaj prt^ssn L14 


rtlK 
Wi09 


1.10 


407il?B 


013370 


Hs.73722 




not 


1 AC 


133445 


T99303 


Hs,73797 


gu^iins nucisc^d b'ndinQ protebi (G pr 


AOA 


1 RR 
1.00 


1J040J 




nS./w/U 


KBraun 


flfK 
U.OJ 


1 1A 
1.14 


133492 


L40397 


Hs.74137 


tfsnsiTiBrnbrans trsfDcUng prot^ 


4 4 

Id 


1 AO 
1.09 


IJJOtW 


UUDCA7A 






n? 
u>/ 


R91 


133517 


X52947 


Ks.74471 


jiDoon pnnan; apa 4jkd (oon 


noc 


1 % 

1.d 


133540 


D78151 


Hs.74619 


proteasoms (prosoms; iiiaaopaln) 26S subo 


A 01 


1 9K 
1.20 


133594 


L07758 


ns. 1/^dt» 


nudcsr phosphoprotein sbnSar to S. osr 


AAA 


1 90 

1.29 


133627 


U09587 


Hs.75280 


■ J inUA nirinfi 

glycyHKNAsyntnetase 


1 no 


1 00 
1.99 


133671 


T25747 


Hs.75471 


sncungerprDiBtn 140 


I.UZ 


4 K 

1.0 


133859 


U86782 


Hs.17o^1 


26S (KOlB8sofnB*assoddBd padi homolog 


1 11 
1.11 


OM 


133865 


F09315 


Hs.1 70290 


discs; IsgB (DrosophBa) honttlog 5 


1 u 
1.W 


e7 
Oi7 


133913 


W84712 


Hs.7753 


caSumerun 


1 4B 

1.19 


1 DC 
1.00 


133963 


L34587 


Ms. iMoSJ 


iranscnpoon ewiQaoon laaor o voiu} 


1.4 


1 01 

1.91 


133982 


U47621 


tlm *UV7*>C4 

ns.20725i 


nudedar autDanSQon (59(0) sbrdar to 


1 4 
1.4 


1 Oft 
1.SI9 


134100 


107540 


HS.! 710/9 


repucauon iacioro(acuvaa)r ij t>{So 


A79 
Ih7< 


1 ftC 
1.00 


134110 


U41060 


Hs.79138 


UV-l prtdsin; estro^Bn regirialied 


1 AA 
1.W 


1 ft9 
1.02 


134156 


U15174 


Hs.79428 


DoL2/aaenoviru5 c IB iHOHnteracQfiffpn) 


4 

1 


1 KC 

1.99 


134161 


U97188 


Hs.79440 


IGr<4] mRNAHnnoing protein o 


MJSZ. 


1 oe 
1.99 


134193 


F09570 


Hs-7980 


ESTs 


0.98 


4 AB 

i.4o 


134357 


X54199 


ns.oZ209 


lAosphoribosylgtydnarnntefonrQfHTansfBr 


4 

1 


9R 


134402 


U^165 


H5.82712 


t W ^ ■ * 1 _ , J '* ■ ,11, ml 

wBBBB A msrtia retaraadon, aiBjsomai 


4 VI 

1.20 


2 


134457 


D86963 


Hs.174044 


(fidisvdlsd 3 (hotnotogous to DrosopMla 


4 

1 


1 A 
1.47 


134469 


X17567 


Hs.83753 


smal] nudear rfbonudeoprotein pdtfps^A 


034 


4 ei 


134498 


M63180 


Hs.84131 


mreonyl-tRNA syninetase 


4 4 

1u( 


9CA 

2.04 


134501 


VV64870 


Hs^11568 


eukaiyoCc translaSon irafiation factor 


A Oil 

0.84 


4 4e 
1.96 


134507 




Ll» o;i44fi 


iqwcaoon prnnn Ai (rOXLQ 


1 7 
l.f 


90^ 
2.M 


134548 


U41515 


Hs.85215 


uei8iSQ in spHrnana/spBi-ioa i legio 


1 Aft 
1.W 


979 
2.70 


134599 


X99226 


Hs.86297 


FanoonI anemia; oon^demsnlatfon group A 


iJo 


2.22 


134692 


R73567 


Lk> oficn 
nS.oo5U 


a (fisintsgrin and ffletaUopfOtBinasa dona 


A77 


1 RA 
1.04 


134693 K70361 Hs.8854 ESTs 1.09 1.82
1J4tHlD 74QAQ0 nS.09f 10 ^Mhiiurb symnase flOR U.90 l.vO
134821 Z34974 Hs.198382 ptetophlEn 1 (edodennaldysptasia/Bidn a99 1.4
134864 Y08999 Hs.90370 ac&i related protein 2/3 oormtex; subun aos 1.42
134914 U29615 H5.91093 chJtift3se1(cli»obkis)das8) 1.16 1.29
134953 L10678 Hs5t747 p7Gfiln2 m 1.76
134993 AA282343 Hs.9242 purine-ddi dament binding prol^ B a98 1.73
135051 C15324 Hs.93668 ESTs 1.35 2.11
135158 U51711 Human (tesfTX)CdBn-2mRNA; J UTR 0.86 1.16



Table 1B shows the accession numbers for those pkeys in Table 1A lacking unigeneID's. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist). The Genbank accession numbers for sequences comprising each cluster are listed in the 
'Accession' column. 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 
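
By way of illustration only, a row laid out as described above (a Pkey, then a CAT number, then one or more Genbank accessions) might be read into a lookup keyed on Pkey as sketched below. The function names and the sample identifiers are hypothetical placeholders chosen for this sketch and are not entries of the table.

```python
def parse_cluster_row(line):
    """Split one whitespace-delimited row into Pkey, CAT number, and accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected a Pkey, a CAT number, and at least one accession")
    pkey, cat_number, *accessions = fields
    return {"pkey": pkey, "cat_number": cat_number, "accessions": accessions}


def index_by_pkey(lines):
    """Build a Pkey -> row mapping, skipping blank lines."""
    table = {}
    for line in lines:
        if not line.strip():
            continue
        row = parse_cluster_row(line)
        table[row["pkey"]] = row
    return table


# Placeholder rows for illustration; these identifiers are not from the table.
rows = [
    "900001 11111.1 AA000001 AA000002",
    "",
    "900002 22222.3 BB000001",
]
clusters = index_by_pkey(rows)
```

Keeping the CAT number as a string, rather than converting it to a number, preserves suffixes such as ".1" exactly as printed.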



Pkey CAT number Accessions 



100661 23182.1 BE623001 105096 AA3B3604AW966416N53^AM6021 3 AW571519AA603655 

100667 26401.3 105424 X56794 S66400 X55150 W80071 AW351820 X55938 MB3326 BE005289 BE070059 M83324 BE00S248 BH)69717 BE161648 BE069700 
AW606203 B£069721 AW382138 AW803776 6E463954 BE005334 BE005274 T2n86AA932714 AA972695 AW377728 AtS32506T29066 
AI7B3934 AW377727 BE163715 AL047291 AA279047 AA523003 BE008048 8E440141 W23814 6E090519 BE092193 N29181 N20358 N441S3 
BE546944 T69231 AW377441 AA907406 H50799AW051416AI420712BE620922 A1279161 AA992549 W47198 BE005241AI 342698 H50700 
Aig69974 A1863855 AA374490 AW130&75'Ai950633 AA146687 H99482 X55150 BE0D5414 BE005339 N28^ A1673068 A18878g0 AW804171 
At675981 AW804172 AA778841 AU)48050 A1127757 AI095568 AW204965 AW468978 W31 898 Ai05;S95 AI278771 BE46401 8 Ai081 503 AI824196 
AA513211 AA411062 AW084376 N48752 AA7D3209 N35S80 AVV05991B AA054563 AI280942 T^^ 

AA283090 AA962S36 H82726 W52115 W45432 W60433 AA577548 AA146714 BE150994 AA054615 AW796025 AW382768 BE565671 C00444 
AA054555 

100568 26401 J L05424 X56794 S65400 X55150 W60O71 AW351820 X55938 M833^ BE005289 BBSmS9 M83324 BE0O5248 fiE069717 BE181646 BE069700 
AVV606203 BE069721 AVV3B2138 AW803776 BE463954 BE005334 BE005274 T27388 AA932714 AAg72695 AW^^ 
AI783934 tmirm BE163715 AL047S1 AA279047 AAS230O3 BE008048 BE440141 W23614 BE090519 BE0S2193 K29181 N20358 N44153 
104667 


AI239923 


Hs.30098 


ESTs 




104404 


H5S762 




gD:c5rD0D57 i-EdVv Homo sapiens cuMA ome 




104392 


AA07d049 


Hs.274415 


■ ■ - ,, _t ..J^kl A CI 1 4 A44Q fJnn n Ue 

Homo sapierts cDNA rUluZZd ns, done HE 


27.20 


104212 


AB0D2298 


HS.173U35 


KlAAuouU protein 




104074 




Hs^l^ 


(tono saidfins mRNA: cDMA DKFZd434M229 ^ 


11.20 


103749 


AL135301 


Hs.8768 


liypothsticai protein FIJ10&49 
succina!a<CQA Sgasa. G0P4Drm!ns. elpha 


10£6 


103845 


AW246^ 


HS.7M3 


12D0 


103554 


AI878826 


Hs^3469 


caveoOn 1, cavedae protein, 22ld3 




103541 


AI815601 


Hs.79197 


CD83 anfigen (acfivatsd B tympiucytes, i 




103496 


Y09267 


Hs.132821 


ftavin contalnino monooxygenase 2 




103428 


BE383507 


H5.78g21 


A kinase (PRKA) anctwr proton 1 


11.20 


103353 


XBS399 


Hs.119274 


RAS p21 pratdn acSvator (GTPase acSva 


19J0 



17.20 

aeo 



&00 



3.91 



1J3 



aeo 

11.40 
4.76 



7.13 
7.00 



29.80 
3.70 



a30 

ao9 

5.40 
7.60 



10.20 
S.69 
a82 
420 



1.76 



105 



2.40 
1.78 
1.76 
219 



1.94 
1.7S 
247 



1.92 



1.93 



151 



1.80 
103295 XB1479 Hs2375
103280 U84722 Hs.75206
103100 H^L005574 HS.1645B5
103D25 NM_002837 Hs.123641
102698 Ml 8667 Hs.1667
102659 BE2451Q Hs^1610
102580 1160808 Hs.1 52981
102417 AA034127 HS.1534o7
102353 N&jLw3734 Hs. 198241
102302 AA30o342 Hs.69171
102283 AW161552 Ks.83381
102188 U20350 Hs.78913
102151 T27013 Hs.3132
101957 L2^4 Hs.74101
101842 M93221 Hs.75182
101771 NM_0u2432 HS.153S37
101754 AI196550 Hs.81256
101716 AP0S0858 Hs.2563
101678 hssssos Hs^lBI
101447 M21305
101383 KM_000132 Hs.79345
101346 AI738616 Hs.77348
101345 NM_005795 Hs.152175
101336 NM.006732 Hs.75678
1013^ L43821 Hs.80261
101277 BE^626 H8.296049
101^ L35854
101168 NM.005308 Hs.211569
101102 NM.003243 H8.79059
101088 X70697 HS.S53
101066 AW9702S4 HsJ89
100971 BE379727 H5.63213
100893 8E245294 Hs.180789
100770 W25797xonip Hs.177488
100716 X69887 Hs,172350
100555 M69181
100425 NM_014747 HS.7B748
100408 D86540 Hs.56045
100382 D83407 Hs.155007
100351 D64158
100299 D49493 Hs^171
100134 AA305746 Hs.49
100108 U09577 Hs.75873
100095 Z97171 H6.78454
100086







eyMto inodidB oonbdiikQi nniclrwGkB, 
cadherin 5> ^rpo 2, VE-cadherin (vascda 
pTDtdn tyro^ne phosphatass, reoepbirt 
pnjQSstitetn (pspsinoQsn Q 

trtptet repeat RNA-tsnding pr^ 
CDP-<S8cylg1ycen)l synOtase (ptosphaSda 
s^iul trsns(hting ateptof mo ls o ute (^t3 
annirecgMase, copper cortaniifl 3 (vase 
protein kmase 0^2 
guanine n ud e o O Je binding protnn 11 
chemoldne (C-X^rec^ 1 
slsioidogeric acute regidatoiy protein 
Sfdsen ^fiosine kinase 
mannoee leoeptor, C type 1 
myeUd oeD nudear dSerenfiaSon ant 
SI 00 cdchin-Undlng piotsin A4 (catcton 
tadqfMidi^ piGCiBSor 1 Substance Ki su 
ooniplsinsntoQfnponentSieceptDf 1 (CSal 
gb:Kuinan alpha satei^s aid satdEte 3 
ooagidatsn factor VUl. procoagUant oo 
hydrosqfprostaotandin dehydrogenase 15-(N 
ca!dtQn&i receptor-Ste 
FBJ nwine ostsosajooma viral oncogene h 
enhancer of GlamsntaSon 1 (cas-Gkedo 
niicrD&br{Ilar-^socist8d protein 4 
gbiHuman dystrophin (dp140) mRNA, 5* end 
G pnilBin-coujded receptor kinase 5 
iransforrring growth factor, beta re^to 
sduta earner family 6 (neurotransnvtte 
Gharot-Leyden crystal protein 
(atty add binding protein 4. adipocyte 
S164protebi 

amyloid beta (A4) precursor protein (pro 
HIR Oiistone cell cgdlB regulalbn defec 
gtnHuman nonmusde niyoshi heavy chai^ 
KIAA0237 gene product 
src homoiogy three (8H3) and cysteine ri 
Down syndrome critical region gene 1-fik 

gnwfh difbrentiaSon factor 10 
nocrophage scavenger receptee 1 
hyaturonogtucQsaminUase 2 
rnyocSin. trabecular nieshwoik indudbie 



3^ 



PCT/US02/12476 



1.76 



25.40 
14.00 

10.68 



16.40 
15.40 



18.80 
504.80 



19.38 

15.40 
1170 
14.80 
33.00 
16.20 



7.40 



31.00 



7.52 



1.78 
2.22 

1.75 
124 

101 

1.91 



11.29 



4.00 
4.24 
6.20 
2170 



5.40 



1.79 



Table 3B shows the accession numbers for those primekeys lacking unigeneID's for Table 3A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist). The Genbank accession numbers for sequences comprising each cluster are listed in the 
'Accession' column. 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 

Pkey CAT number Accessions 



AA602984AA609^ 
AA325606AA099517N89423 
H04043 060986 D60337 
AA248234AA090985 
AA399961 AA128347 
AA393283AA398628 

AA611B04AAa09404AA286907AW977S24 
AA431082 

AA226198 AA22851 3 AA383773 
AA620448 
H50834 



123619 371681J 

126433 127143J 

125831 1522905J 

126816 122973J 

126852 136135J 

121059 273450.1 

120637 200885.1 

122011 7617.-2 

120934 177521.1 

123802 genbanKJ\A620448 

116614 genbanlLH50834 

118329 genbanlLN63520 

104404 H58762_at H56782 

104776 genbankJ\A02634g 

113502 genban)LT89130T89130 

101262 entiez^l.^854 L35854 

108573 genban)LM086005 

101447 entre^M21305 M21305 

124357 genbanUC2401 

108781 genb3n)LAA128654 

112794 genbaniLR97018 

100351 entnBU}64158 D64158 

10(S55 ^.HT2245 M69181 M81105 U51039 



AA026349 



AA086005 

N22401 

AA128854 

Rg7018 
Table 4A shows 2(go8 genes upregulated in samples from patients. These genes were selected from SSSSO probesets on the 
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this 
the relative level of mRNA expression. 

Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: average of AI for samples from patients treated with chemotherapy or radiotherapy 
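
By way of illustration only, one row of a table with the columns defined above might be split into its fields as sketched below. The sketch assumes, as a simplification not stated in the specification, that the UnigeneID (when present) begins with "Hs." and that R1 (when present) is the last token and is numeric; the function name is hypothetical.

```python
def parse_expression_row(line):
    """Split a row into Pkey, ExAccn, optional UnigeneID, title, and optional R1."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("expected at least a Pkey and an exemplar accession")
    pkey, ex_accn, rest = tokens[0], tokens[1], tokens[2:]

    # Assumption: a UnigeneID, when present, is the third token and starts with "Hs."
    unigene_id = None
    if rest and rest[0].startswith("Hs."):
        unigene_id = rest.pop(0)

    # Assumption: R1, when present, is the final token and parses as a number.
    r1 = None
    if rest:
        try:
            r1 = float(rest[-1])
            rest.pop()
        except ValueError:
            pass

    return {
        "pkey": pkey,
        "ex_accn": ex_accn,
        "unigene_id": unigene_id,
        "title": " ".join(rest),
        "r1": r1,
    }


# A row as it appears in the table, and a hypothetical row lacking UnigeneID and R1.
complete = parse_expression_row("106533 AL134708 Hs.145938 ESTs 59.80")
partial = parse_expression_row("900001 X00000 hypothetical protein")
```

A title that itself ends in a number would be misread as R1 when the R1 cell is empty, so rows of that kind would need to be checked by hand.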



Pkey ExAccn UnigeneID UnigeneTitle R1 

1001 ]3 NM.001269 Hs.64746 chromosome condensaSao 1 27 JO 

1001B7 D177S3 Hs.76183 a!dt>4(eto reductase family 1,menAerC3 20.60 

100210 D2&381 Hs.3104 KIAA0042 gene product . 20.40 

100225 D28539 Hs.167185 gtutamate receptor, metaboirapic 5 20.60 

100269 NhL001949 Hs.11B9 E2F transcripSon tedor 3 3.40 

100438 AA013(ei HsJ1417 topoisoinefase{ONA) II bfaKfing protein 

100677 XS0821 Hs27973 K1AA0874 protein 

100893 BE245294 HS.1807S9 S164 protein 43.40 

101273 Z11933 Ks.18^ POU domain, class 3. transoq>8on facto 21.80 

101447 M21305 gb:Kuman alpha sat^ and saieiae 3 19160 

101649 AW95990B H8.169D heparin^nding growth factor binding pr 38.40 

101724 L1 1690 H5.620 buDous pemptugoid antigen 1 (23Qfi40M)) mSO 

101748 NM.001944 Ks.192S desmoglam 3 (pemphigus vulgaris anOgen 78.60 

101809 M6S849 Ks^33 gap juricaon protein, beta 2. 26kD (conn 16120 

101879 AA176374 HsJ43868 ' nuclear autoanfigenic spenn protein (his 50.00 

101915 AF207881 HS.15518S cytDsoCc ovarian caidnoma antigen 1 26.00 

101973 U41S14 H&80120 UOP^i-acetyMpha^alactosaminerpolyp 37^ 

102025 U04045 H5.78934 mutS(E.coG)homoIog 2 (colon cancer, 

102031 U04898 Ks^156 RAR^ated orphan receptor A 32.00 

102052 NM_002202 Hs.505 ISL1 transcription factor, LIM/homeodoma 51^
102391 AA296874 Hs.77494 deoxyguanosine kinase 13.90
102420 U44060 Hs.14427 Homo sapiens cDNA: FLJ21800 fis, clone H 28.80
102610 U65011 Hs.30743 preferentially expressed antigen in mela 110.60
102829 NM_006183 Hs.80962 neurotensin 11&80
103000 NhL0019^ Hs.146580 enolase 2, (gamma, neuronal) 2.30
103036 M13^ Hs.83169 matrix metalloproteinase 1 (interstitial 181.40
103507 AJ000512 Hs.296323 serum/glucocorticoid regulated kinase 49.20
103587 BE270266 Hs.82128 5T4 oncofetal trophoblast glycoprotein 86.60
104660 BE298665 Hs.14846 Homo sapiens mRNA; cDNA DKFZp564D016 (fr 42.60
104896 AW015318 Hs.23165 ESTs 29.40
105038 AW503733 Hs.9414 KIAA1488 protein 21.50
105299 BE387790 Hs.26369 hypothetical protein FLJ20287 32.80
105510 Z42047 Hs.283978 Homo sapiens PRO2751 mRNA, complete cds 20.20
105667 AA767526 Hs.22030 paired box gene 5 (B-cell lineage specif 28.40
100)73 AL157441 Hs.17834 downstream neighbor of SON 25.40
106205 AWg6505B Hs.111583 ESTs, Weakly similar to I38022 hypotheti 32.00
106516 AL137311 Hs.234074 Homo sapiens mRNA; cDNA DKFZp761G02121 ( 40.60
106533 AL134708 Hs.145938 ESTs 59.80
108575 AW970602 Hs.105421 ESTs 43.40
106654 AW075485 Hs.286049 phosphoserine aminotransferase 50.K)
106851 AI458623 gbjkD4g09Jc1 Na_CGAPJLii24Hom08aptens 53.40
1069^ AB023139 Hs.37892 KIAA0922 protein 2038
107332 TB7750 Hs.183297 DKFZP566F2124 protein 23.60
107532 AA443473 Hs.173884 Homo sapiens mRNA; cDNA DKFZp762G207 (fr 57.20
107922 BE153855 Hs.61460 Ig superfamily receptor LNIR 49.00
108S09 BE409857 Hs.69499 hypothetical protein 19.67
108760 AU076442 Hs.117938 collagen, type XVII, alpha 1 48.17
109166 AA219691 Hs.73625 RAB6 interacting, kinesin-like (rabkines 59.20
109260 AW978515 Hs.131915 KIAA0863 protein 28.60
109280 AK001355 Hs.279610 hypothetical protein FLJ10493 22M
109292 AW975746 Hs.188662 KIAA1702 protein
109384 AA219172 Hs.86849 ESTs 21.00
109415 U60736 Hs.110826 trinucleotide repeat containing 9 31.60
109445 AA232103 Hs.189915 ESTs 24.20
109502 AW967069 Hs.211556 hypothetical protein MGC5487 21.40
109633 AW003785 Hs.170267 ESTs 20.40
109765 AI989482 Hs.146286 kinesin family member 13A 19.60
109956 AA0012K Hs.133521 ESTs 24.00
110920 N47224 i^s^Zi HMT1 (hnRNP methyltransferase, S. cerevi 28.40
110924 AW058463 Hs.12940 zinc-fingers and homeoboxes 1 36.00
111084 H44186 Hs.15456 PDZ domain containing 1 61.20
111132 AB037807 HsJ3293 hypothetical protein 24.60
111229 AW389845 Hs.110855 ESTs 27.20
111337 AA837396 Hs.263925 LIS1-interacting protein NUDE1, rat homo 48i)0
111987 NM_015310 Hs.6763 KIAA0942 protein 37.80
112046 AA383343 Hs.22116 CDC14 (cell division cycle 14, S. cerevi 28£0
112268 W39609 Hs.22003 solute carrier family 6 (neurotransmitte 63.80
112^ R87650 Hs.33439 ESTs, Weakly similar to ALU1_HUMAN ALU 26.40
112871 AI110216 HS.122B5 ESTs, Weakly similar to I55214 salivary 47.64
112897 AW205453 Hs.3782 ESTs 22J)0
112973 AB033023 Ks.3iai27 hypothetical protein FLJ10201 65.00
112992 AL1574S Hs.133315 Homo sapiens mRNA; cDNA DKFZp761J1324 (fr 42.00
113073 N39342 Hs.103042 microtubule-associated protein 1B 55.40
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113494 


T91451 


Hs»86538 


ESTs 


M Oft 


11^60 


T91015 


Hs.268626 


cSTS 


tjn on 


113849 


AA457211 


KS.B8M 


bcQRKidofnaio at^jacsent to zbic finjjGy dcHoa 


£4 OA 
31.00 


11SS) 


AI267652 


Hs^0504 


t «- . II ii'ii „,. _BAi A. mnUA ni/c7iBA4iicnft'k j& 

Horno sapieiB miviA; ojNA uKr^MSwOBZ p 


40.2U 


114339 


AA782845 


Hs.22790 


ESTs 




114000 




nStlODM 


nypOuBaSi prOini nJl4ttf/ 


CI.W 


114459 


K37908 


lis 07icie 


cqTSi Weaxiy similar » ALUdLnUAwvi aui s 


OS on 


114518 


AW1b^o7 


Hs.106469 


suppress of varl (Sxereirisia^ 3499 


» OA 


114824 


AA980961 


Hs,305953 


zinc finger piuMn 83 (HPF1) 


27.20 


114837 


DC2449^ 


nS.lD&o35 


c5TS 


iSJL2U 


114974 


AVV365931 


Hs.i79oo2 


nudfiosomB assenfbly pratin 14kB 1 


'M on 
2D.O0 


liaQTa 


AAoisiMd 


n5.00U45 


cSTS 


on cn 


11S)84 




(^42484 


nypotneSca pretBD FUIudIB 


OO 00 

2o.6o 


115291 




nS.1 22379 


nypou^uca proton rUiWoj 


OO AA 
00.W 


115313 


A ADflSnAI 

AAoUoOul 


Hs.lo44il 


albumin 


OO OA 


11K97 




HS.Q3325 


transmsmbfand protBSSG, serine 4 


1TO CA 


115909 




H&59761 


cbTs, weany sinuiar d QAri_nUMAn DcATH 


ZfJl 






rS.o1Zo2 


CCTa 

cSTS 


OA OA 


118107 


Al 4MB«g 


Hs. 172572 


nypansQca pretaoi ru^JiSf j 


164.20 


11S399 


AASB9120 


Hs.1 10637 


tomeo boxAlO 


38.00 


11709 


K93699 




^^^^^4 0^4 4 ..4 ^ f.i ii.l ^ * 

gb^Tvi 6a1 1 ^1 Scares tsm uver spleen 


21.^ 


117881 


AF1 61470 


11-. ^eAfioo 

n8^6UD22 


iRiiyiawwQuceo transcnpi 1 


49.40 


118091 




HS.47BB3 


CCTb lAlmJbfci iilmllm 111. LII iUAil PAI 

cSTB, weaHy anuar b ma4I.JiUman lalu 


04 AA 


llDlOO 




Hs33560 


noniosapiensnwiAior NAAi// 1 prosn, 


99 Aft 








gb:za49d07.8l Soares fetai Ever spleen 


OA AA 


118873 AI824009 Hs.44577 ESTs 19.40
119126 R45175 Hs.117183 ESTs 111.20
119717 AA318317 Hs.57987 B-cell CLL/lymphoma 11B (zinc finger pro 33.(K)
119940 Hs^2531 DKFZP58630319 protein 31.00
120266 AI807264 Hs^05442 ESTs, Weakly similar to T34038 hypotheti 20.20
120515 AA258356 gb2i59c10.si Soares JfiHMPu^Si Homosapi S.00
120859 AA826434 Hs.1619 achaete-scute complex (Drosophila) homol 95.40
120983 AA398209 Hs37587 EST 105.20
121054 AVra7657U Hs.97387 ESTs 38.80
121369 AW450737 Hs.128791 CGI-09 protein 41.60
122335 AA44o258 Hs^41551 chloride channel, calcium activated, fami 30.80
122612 AA974o3Z Hs.128708 ESTs 19.00
123130 AA487200 gb:ab19tQ2.s1 Stratagene lung (937210) H 33.20
123440 A17336S2 Hs.112488 ESTs 23.17
123696 AA4Z1|JU HS.11S40 EST 2J.00
123619 AA602964 gD3K}97cQ2.s1 NCI_CGAP_Pr2 Homo sapiens 28.80
124006 AI147155 lie OTniMC ESTs TT OA
124169 BE079334 Hs.271630 ESTs 22^
124281 AI333756 Hs.111801 arsenate resistance protein ARS2 42.20
124472 N52517 Hs.102670 EST 32^
124617 AW^168 Hs.152684 ESTs 21.60
124631 NM_014053 Hs.270594 FLVCR protein 30.40
124839 R55784 Hs.140942 ESTs 21^
125186 AA610620 Hs.181244 major histocompatibility complex, class ^80
125321 TB6652 Hs.178294 ESTs 27.00
125535 kill t\4VtJ^ HS.2Z2i3 secretogranin III 23.60
125646 AAdZo9o2 HS.752U9 protein kinase, cAMP-dependent, catalyti 23.20
125684 AvV5o94Z7 Hs.158849 Homo sapiens cDNA: rU2iDo3 fis, clone c 21 uS)
125724 AL380190 Hs79597B Homo sapiens mRNA full length insert cDN 48.80
12S47 AW161885 Hs.249034 ESTs 31.00
125934 AA153325 Hs.32646 hypothetical protein FLJ21901 21.20
126077 M7S772 Hs.210836 ESTs 49.60
126299 AVV9791S Hs.298275 amino acid transporter 2 21.80
126395 AI468004 Hs.278956 hypothetical protein FLJ12929 71.00
126433 AA325606 gb:EST28707 Cerebellum II Homo sapiens c 23.20
126509 R47400 Hs.23850 ESTs 23.80
126538 AB03065& Hs.17377 coronin, actin-binding protein, 1C 23.10
126666 AA648886 Hs.151999 ESTs 36.00
126812 AB0378m) Hs.173933 nuclear factor I/A 20.80
126872 gb:Ut4^«i3'dM-124HiU1 Nd.OQv'.Sii 46.29
127046 AA321948 Hsj29396B ESTs 22.80
127431 AW77199 Hs.175437 ESTs, Moderately similar to PC4259 rem 3CL00
127489 AAdSDZSD Hs.272076 ESTs ^.80
127521 AvV^7208 Hs.164018 ESTs 25.20
127742 AW293496 Hs.180138 ESTs 28.00
127925 AAoUoidl Hs.3628 mitogen-activated protein kinase kinase 21.20
127930 AAKBd/2 Hs.123304 ESTs ZiM
127968 AA8X1201 Hs.124347 ESTs 28.20
127987 AI022103 Hs.124511 ESTs 19.60
1^16 KO7103 Hs.285014 Homo sapiens, clone IMAGE:3oo724j, mRNA 20.40
128^ kill t^OCiC HS.10245O survival of motor neuron protein interac 34.40
128777 AI678916 Hs.10526 cysteine and glycine-rich protein 2 53.80
AA0(S647 14e Often a disintegrin and metalloproteinase doma tMN
129168 AI132988 Hs.109052 chromosome 14 open reading frame 2
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Homo sapiens cONA: FIJ22373 fe, done H 


20.40 


132084 NM_002267 Hs.3888 karyopherin alpha 3 (importin alpha 4)
TABLE 4B shows the accession numbers for those primekeys lacking unigeneIDs for Table 4A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
'Accession' column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT number Accessions
123819 371681_1 AA602984 AA609200
126433 127143_1 AA325606 AA099517 N89423
126872 142598_1 AW450979 AA136653 AA136656 AW419381 AA984358 AA492073 BE168945 AA809054 AW238038 BE011212 BE011359 BE011367 BE011368 BE011382 BE011215 BE011385 BE011363
106851 322947_1 AI458823 AA639708 AA485409 R22065 AA48S57D
118720 genbank_N73515 N73515
120515 genbank_AA258356 AA258356
117099 321871_1 H93699 H97976 H80036
101447 entrez_M21305 M21305
123130 genbank_AA487200 AA487200
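A row of the accession table above has the form "Pkey, CAT number, Accessions", with one or more Genbank accessions per gene cluster. The following is a minimal sketch, not part of the original disclosure, of how such rows might be read into a lookup from accession to probeset key. The function names are hypothetical, and the two sample rows are cleaned-up readings of rows in the table.

```python
def parse_cluster_row(line):
    """Split one 'Pkey  CAT number  Accessions' row into its three parts."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(
            f"expected Pkey, CAT number and at least one accession: {line!r}"
        )
    pkey, cat_number, accessions = fields[0], fields[1], fields[2:]
    return pkey, cat_number, accessions


def build_accession_index(lines):
    """Map each Genbank accession to the probeset keys whose cluster lists it."""
    index = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank separator lines between rows
        pkey, _cat_number, accessions = parse_cluster_row(line)
        for accession in accessions:
            index.setdefault(accession, []).append(pkey)
    return index


# Sample rows (cleaned-up readings; whitespace-separated fields are assumed).
rows = [
    "118720 genbank_N73515 N73515",
    "101447 entrez_M21305 M21305",
]
index = build_accession_index(rows)
print(index["N73515"])
```

An accession that belongs to more than one cluster would simply map to several probeset keys, which is why the index stores lists rather than single values.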
Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma [...]

diseased lung samples.
R3: 80th percentile of AI for squamous cell carcinoma lung tumor samples [...]

R4: 80th percentile of AI for adenocarcinoma lung tumor samples divided by the 80th percentile [...]

R5: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma lung tumor [...]

diseased lung and tumor samples divided by 80th percentile of AI for normal samples minus the 15th percentile of AI for all

normal lung, chronically diseased lung and tumor samples
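The ratio columns in the legend above compare a percentile of AI in one sample group with a percentile of AI in another, in one case after subtracting a low percentile of all samples as a background level. A minimal sketch of that arithmetic follows. It is an illustration only: the reading of AI as an average intensity value, the linear-interpolation percentile, the function names, and the sample numbers are all assumptions and are not taken from the disclosure, and the percentiles are left as parameters because several of them are only partly legible in the legend.

```python
def percentile(values, p):
    """Linear-interpolation percentile (p in 0..100) of a non-empty sequence."""
    data = sorted(values)
    if not data:
        raise ValueError("percentile of empty data")
    rank = (len(data) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(data) - 1)
    return data[lower] + (data[upper] - data[lower]) * (rank - lower)


def percentile_ratio(case, control, case_pct, control_pct,
                     background=None, background_pct=15):
    """Ratio of a case-group percentile to a control-group percentile.

    If `background` is given, its `background_pct` percentile is subtracted
    from both numerator and denominator before dividing.
    """
    offset = 0.0 if background is None else percentile(background, background_pct)
    denominator = percentile(control, control_pct) - offset
    if denominator <= 0:
        raise ValueError("control percentile does not exceed the background level")
    return (percentile(case, case_pct) - offset) / denominator


# Hypothetical intensity values for one probeset.
tumor = [10, 20, 30, 40, 50]
normal = [1, 2, 3, 4, 5]
print(percentile_ratio(tumor, normal, 50, 50))
print(percentile_ratio(tumor, normal, 50, 50, background=[0, 2, 4], background_pct=50))
```

Subtracting the same background level from both groups keeps a probeset with uniformly low signal from producing a large ratio out of noise alone, which is one plausible reason for the form of the last ratio in the legend.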
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mtrdchranosome maintenance deficient (S. 
phosphofructokinase, platelet
proteasome (prosome, macropain) subunit
E2F transcription factor 3
chaperonin containing TCP1, subunit 5 (e
protein disulfide isomerase-related prot
minichromosome maintenance deficient (S.
platelet-activating factor acetylhydrolas
uridine monophosphate kinase
KIAA0175 gene product
amylase, alpha 2A; pancreatic
RAN, member RAS oncogene family
non-metastatic cells ^ protein (NM23Q
carcinoembryonic antigen-related cell ad
prolactin-induced protein
collagen, type VII, alpha 1 (epidermolys
calcitonin/calcitonin-related polypeptid
mitogen-activated protein kinase kinase
Homo sapiens ribosomal protein L39 mRNA,
zinc ribbon domain containing, 1
general transcription factor IIA, 1 (37k
myeloid/lymphoid or mixed-lineage leukem
KIAA0618 gene product
flap structure-specific endonuclease 1
gb:Human transketolase-like protein gene
ret proto-oncogene (multiple endocrine n
guanine monphosphate synthetase
keratin 14 (epidermolysis bullosa simple
gb:Human proliferating cell nuclear anti
glucose phosphate isomerase
potassium voltage-gated channel, Shab-re
protease inhibitor 3, skin-derived (SKAL
melanoma antigen, family A, 2
macrophage migration inhibitory factor (
ataxia telangiectasia group D-associated
opioid receptor, mu 1
cyclin-dependent kinase inhibitor 3 (CDK
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keraOn 5 (e^emio)ysis bulosa simplex 
bone morphogenetic protein 2
glutamic-oxaloacetic transaminase 2, mi
interferon-induced protein with tetratri
gb:Human parathyroid hormone-related pro
asparagine synthetase
aconitase 1, soluble
ta)iQarin

v-ros avian UR2 sarcoma virus oncogene h
heparin-binding growth factor binding pr
HS Kslooe f^, member 0
H2A histone family, member A
growth arrest and DNA-damage-inducible,
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5.67 
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4.17 
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38.60 
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4.01 

4.46 
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02983 BE387202 Hs.1 18638 
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03099 AI693251 Hs.K46 

103119 X63629 Hs.2877
103168 X53463 Hs.2704
103185 NM_006825 Hs.74368
103192 M22440 Hs.170009
103223 BE275607 Hs.1708
103242 X76342 Hs.389
103316 X83301 Hs.324726
103375 NM.00^2 Hs.54416
103376 AI036166 Hs.323378
103385 NM_007069 Hs.37189
103391 X94453 Hs.114366
103404 BE394784 Hs.76596
103430 BE564090 HS.2071G
103446 X98834 Hs.79971
103476 Y07701 Hs.293007
103477 AJ011812 Hs.119018
103478 BE514982 Hs.38®1
103515 Y10275 Hs^07
103558 BE616547 Hs.2785
103580 AA328046 Hs.46405
103587 BE270266 Hs.82128
103594 AI%8680 Hs^16
1038% KM_008235 Hs.2407
103768 AF086009
103841 AA314821 Hs.38178
103847 AF219946 Hs.102237
103913 AW967500 Hs.133543
104094 AA418187 Hs^lS
TABLE 5B shows the accession numbers for those primekeys lacking unigeneIDs for Table 5A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
'Accession' column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
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117079 
124305 
101502 
109792 
126034 
102768 
126345 
127066 
127099 
119243 
125675 
112054 
126979 
12^ 
122318 
114699 
114793 
108305 
108393 
100867 
123731 
109700 
120715 
113702 
115113 
101045 
108554 
108573 
119052 
126522 



1621717J 

242183_t 

182Q2_-« 

754958 1 

1598157J 

44641.1 



1703458J 
244301 1 
1774795.1 
1566433 1 
1538292.1 
171411 1 
680655 1 
292419.1 
13532Z.1 
15074^.1 
111550.1 
113411.1 
tigr.HT4588 
genban)LM609839 
genban)Li^)9609 R}9609 
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R43590F10439 
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AI809521 H12174Z42556 
AA429743AA442754 
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AA034948 
AA0B6005 
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AA845593 Ai62371 1 N68583 C00084 AA193557 AW083888 AW163216 AA191S95 AA522778 A1628008 At915518 AA843508 AI926195 
AA176265AW167963AAS92115VV93647AWO3572AI862994AI342059AA911719AA1761S5AAO24712AA0^ AI591107 
AI199673A1811766AI275832AH22233A)191BS2AI096682AI580124AI683512AA5B2453AAS27559AA488415T3241^ 
H44848 H20477 T91695 W47039 AA0700S AA02479S AA328855 AA379248 AA379330 AA3855B0 W2S20 W03688 AA448359 AA093881 
AW362477 AA08^ AI350265 W93479 NW688 AA932257 AW351469 H68590 AA663402 AA069771 AW087986 AI858420 AA600214 
AI970774 A1857712 AI683081 AI885584 AW131150 AI5S7981 AW002714AW1 89973 AW075495 AW1 68303 AA95371 4 AW51 6881 AI357375 
AI566663 AW512576 A1570580 AI023690 AA44821 6 AI079853 A1422707 AA77951 6 AW026972 AW130G82 AW1 62307 AW438646 AA709332 
AW92394AI157350AI217879AI129152AA719509AI350480AA663418AI003634AW118546AA180261AA442833^ 
AI038759 AA846723 AI248770 AA993S94 AI280335 AIB85107 AW51 8649 AA&41563 AAS95835 AA58^ AI276744 AA43S478 A1017360 
AI620763 A1859887 N73926 AI076327 Ar74161 5 All 60817 AW172819 A{4920(£ AA677429 AA^6334 AI693771 AB50039 AI245629 AI288515 
AI886186 T93293 AA1 73262 AA599779AI680ra2 AW439316 AI084555 AI272fi72 A158^7 AW473219 AA738132 AW473283 AI367432 
AAS95410 Af689S24 AA206353 AI033095 AI040382 AA873630 AI221074 AI934340 AI418680 AA844306 R94503 AA773520 AA843169 
AA21 9425 AA629658 AI81 1719 AW41 1 275 AI590981 W37907 AI591 178 AI884051 AAS83238 AA669347 AA976239 AA704570 A1628339 
A1884391A)241580A1003539AWM76687AA009650 N34555A1333493AI186070AA070827AM11663A128K^ 



WO 02/086443 PCT/US02/12476

AM38789AA232172AW360778W2S882 R6Q282AA436S30AA378SMM167451 AIS4^ K88243N84281 
AA209340 NS6174 N88374 AA191088 AVV247691 AA24g013AA093111 AA972S36 AV^^ 
AI288829 AA843996 W1S2G0AI1882B6 AW248079 R1S836 

115599 genbank_W45552 W45552
112382 genbank_R59904 R59904
10S264 genbank_AA227834 AA227934
100071 enlieO^02 A2B1Q2
123315 714071_1 AM98389 AA496646



Table 6A shows 99 genes up-regulated in non-smokers with [...] These genes were selected from 59680 probesets on the

Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis [...]
the relative level of mRNA expression.
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Homo sapiens cONA: FU23004 68. done L 


17.92 



121558 


AM124S7 




gDas»9i^' aosresjensjvii rnnosap 






iZioTd 


H56037 


nS.iQoi40 


CCTm 
COlS 






121938 


AI(a4D00 


H5.98S12 


ESTs 


15.00 




121938 




Hs,SSoiO 


ESTs 


\A ftfl 




122177 


AA435789 




EST 


fi n 




123442 




HSil 11496 


t > i,,i1,,nr n^il/l R l<<eJ4lk> -1 « tT- 

nospo sapiens cuNA FUI id43 bs, done He 






123551 


AAfi08837 




giKauoniZ-si soaiesjesus.NHT Homo sap 


1150 




1237^ 


AA60S971 


Hs. 112795 


EST 






123851 


AAd20o40 










124371 


N24924 




c5TS 


txaU 




127477 


BE32B720 


nS.2cHJo9l 


cola 




AT) 


127591 


AMonciin 
AI19Dd40 








MIc 


128252 


AA455924 


KS. 192220 


cols 


*.W 




1284^ 


AI255784 


nS.145i97 


colS 






128925 


Kd74i9 


IJ» OIDCI 


nOfTio sapiens CUNA nj i^mju i6i cKjne N 




9 11 


128945 


Ai&oncftc 
AI99UOUO 


nS.oO/7 


nonio sspens uikna, cuna ui\r£p94fcio4 


innn 




129105 


AlToSioO 


IJ- 4nOC<H 


Honto s^iiens brsin tumof tftsoddtttd (vot 


4c en 




123235 


AUUDI7799fi 

Faw9tf£lo 


n8.icDUM 


lUnnims pnHBDl 






129506 


ABQ206B4 


Hs.11217 


NAWo// |»DISui 






1^95 


U09550 


Hs.1154 


ovKnssiai giycoprcxein i, izuKU(Fnuan 9 






130160 


AA305bo8 


nS.207o95 


UUr*vSCD6iav^CNAC 0613 it<>^aECl05yiu 


/ULUU 




1^840 


D82326 


Hs.239iQo 


sotuts canier fiamOy 3 (cysOne, dbad 


44 en 




131220 


AdDZji94 




KIAA0977 protein 


lf«Ov 




131430 


Alo79i4o 


Hs.26770 


TatQT aciu nnoing pnxein t, omn 


R in 




132114 


kit! Anc4ei 

NNL00o152 


nS.wZJZ 


{ymphoid-restFfc^ed mOTibranfi protein 




0.10 


132458 




Hs.48965 


nOnio sapiens cuivx ruzi dsm us, aane u 




OmIO 


132647 


kill AftCfttl 


H5.54432 


slalyilransfersse 4B (bela^dadosidase 


7 cn 
/.Oil 




132655 


D49372 


LI* CAACt\ 


^ , ■ 1 ,B IjfcjImJKla Jiul-oi^lm^ ^■ittfgwTWi A 

snd oniucuNe cyiaons suinanujj a 




^.aj 


132662 


AinTTcnn 
AI0773U0 




serologically defined colon cancer anliQ 




9 Ql 


132747 




HS-OO950 


cois, weaiay snnuar to iuaaioou pnuein 




9M 

2.0d 


132812 RSI333 3^
133337 AR}85983 Hs33876 ESTs 5.00
133876 AL13490& Hs.771 phosphorylase, glycogen; liver (Hers dis 100
134119 AW157837 Hs.79226 fasciculation and elongation protein zet 2.06
134464 AA302983 Hs239720 CCR4-NOT transcription complex, subunit 2.27
134542 M14156 Hs.85112 insulin-like growth factor 1 (somatomedi 11.50
135002 AM48S42 Hs^1677 G antigen 7B 87^
135305 AA203555 HS.S8286 Homo sapiens cDNA FLJ14903 fis, clone a &50



TABLE 6B shows the accession numbers for those primekeys lacking unigeneIDs for Table 6A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
'Accession' column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT number Accessions
108562 35375_1 AA100796 AR)20589 AA074629 AA075946 AA100849 AA085347 AA126309 AA079311 AA079323 AA085274
103439 35330_1 X98266 N41124
123551 genbank_AA608837 AA608837
123851 genbank_AA620840 AA620840
102832 entrez_U92015 U92015
101972 entrez_S82472 S82472
121558 genbank_AM12497 AM12497
Table 7A shows 93 genes down-regulated in non-smokers [...] These genes were selected from 59660 probesets on the

Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis [...]
the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: 90th percentile of AI for samples from smokers with adenocarcinoma divided [...]

R2: 90th percentile of AI for samples from smokers with squamous cell carcinoma [...]
carcinoma



Pkey


ExAccn


UnigeneID


Unigene Title


R1


R2


100187 


017793 


H5.78183 


nfflinitytflstoifffifl ^iusivo Cssub) protBln 




154.10 


100380 


082343 


Hs.16551 




77.40 


10(S76 


XD0356 


HS.3705B 


cakftirSn/calcBoiAHeblBdpalypepSd 


102.40 




100971 


BE379727 


Hs.83213 




45330 




101046 


K01160 




fNONQ 


67ZflO 


• 


101066 


AW97Q254 


Ks.889 


C^3ff^m ffytf en c^fs^^ protE^n 


66.00 




101175 


U^l 


Hs.36980 


f '] iPisnfliiTA flntnon fswrSht A. 7 




77.20 


101497 


W05150 


Hs!37034 


homeo box A5 


62.80 




101663 


NM_003528 


HsJ2178 




78.00 




101677 


NM.000715 


Hs.1012 


complement component 4'bindinQ prntemt 


186.20 




101745 


M88700 


fte.150403 


dopa decarboxyiase (aroma& L-anwioeci 


80.08 




101941 


S775B3 




gbi^ERVKIO/HUMMTV iBvefse bansofptasa 


99.20 




102125 


NM.006456 


HS.28B21S 


daiyllransferase 




103.10 


102242 


U27185 


Hs.82547 


refin^c ackI receptor res ponder 0az8ro 


67j00 




102340 


U37055 


Hs27B657 


1 rmH pphaQe sSmutaOng 1 (hepatDcyte qto 


7150 




102369 


U39840 


Hs.a9867 


hepatocyte nuctes fflfrfor 3^ el^ia 




©.70 


102457 


NM_001394 


Hs.2359 


dual sppcifjoly phosp^^fise 4 


153L00 




102869 


U71207 


Hs. 29279 


Qfes absent (DmsopWl^ homdog 2 




65.70 


102796 


A1079646 


He.107D19 


symp^^yKfn^ HunHngtin wtttiHriinQ prok^ 




ssiso 


102829 


NM-OOOISS 


HS.B0962 


neurotensin 




286.80 


103207 


X72790 




gij^Human endogenous lubuAAis mRNA for 


70.00 




103242 


X75342 


Hs.369 


nlnohal rfnhvflnMAfisiSA 7 friass l\A mi a 
oiwiiwi usiijuiujjoiioso f |uaa9 iV|| iini u 

casefait a^he 




212.10 


103260 


X78416 


HS.31S5 




130.70 


103351 


XB9211 




QbiHjsepiens DNA for ertdogenous r^rovlr 


64.60 




104212 


AB002298 


H5.173035 


KlAA0300prot^ 


66.80 




104252 


AF002246 


Hs.210863 


fSii sdndSiDfi fHiwHwifto uAh hflnvwiflv fls 


63.80 




104258 


AF00721B 


Hs.5462 


s^ute earner feRiily 4i sodium UcaAon 


94.40 




105024 


AA1 25311 


Hs.9879 


ESTs 


68.20 




106280 


AI097144 


H5.5250 


ESTs, W^ald^ simSar to AUU1_HUMAN ALU S 




74.60 


106440 


AA449563 


Hs.151393 


fulfil iKn^^«jOiS41lv U^^OOOi I#CIw«jUm ow 




71.10 


1085^ 


BE298210 




Qb'60111801^1 NIH MGC 17Hano8aaieKe 


73.20 




106605 


AW772298 


Hs.21103 


Homo sapiens mRNA; cONA OKF2k>K46076(fr 


S3.80 




106614 


AA646459 


Hs.335951 


hvnflthflfleal nmhwn AP301 272 




62.30 


106654 


AW075485 


Hs.286049 






202.40 


108999 


H93281 


Hs.10710 


hypoOieSca! proton FLJ2D417 




@.60 


108700 


AA121518 


Hs.193540 


ESTs, Moderately similar to 2109260A B e 




66.40 


108810 


AW2S647 


Hs.71331 








108657 


fVWv i*tuo 




iwlllin flTffyeftnMa Orranc tMunnlml su4 

dUUUI VlMUSUfHHo vMopS IHNIIBIUyjji oCt 




era An 


109597 






ESTs 


65.00 




109891 


J15555B 


HS.128S0 


ESTs 




(SR7Q 
yQ.rU 


109704 


AI743680 




ESTs 




60.60 


110942 


R63503 


Hs.28419 


ESTs 


76.40 




111722 


R23924 


Hs.23596 


EST 


74.60 




112891 


J03B27 


H5.293147 


ESTs. Moderatelv similar to A46010 X-O 


64.60 




112992 


AL1574K 


Hs 133315 






76.70 


113073 


N39342 




iiilmolubtilifr^Rsoofllcd piolfiin IB 




120.20 


114251 


H15261 


H5.21948 


ESTs 


127.20 




115230 


AA2783(K) 


He 124292 


Hnfno Rai^AiVi rnNA* Pt 19^19^ fis. riniM 1 
nuiiiw ooptoio vwi w rL«iA*iicw ihIiMWIO u 


174.00 




115291 


BE545072 


Hs.122579 


himnfhpfifgH oroipin FLJ1Q4fi1 




91.00 


115815 


AW9flS32B 


He 180842 


ilbosomd protein LI 3 


66.40 




115909 


AWB79S97 

n V w Iff mttm w 




£STs.WfiflldvBiiril»lDDAP1 HUUAN DEATH 




226.60 


11S96S 


AA001732 


Hs.173233 


hypothefical protein PU10370 


82J0 




116107 


AL133916 


H8.172572 


hypotheOcd pmteln FLJ2CR)93 




351.60 


116552 


D20508 


Hs.164649 


hypofheGca! piot^ DKFZp434H247 


69.00 




116571 


045652 




gb:HUMGS02848 Human aduHhstg 7 direct 


64.20 




116466 


N66741 




gb:yz33g08.8l btorton Fetai Cochlea Homo 




6X50 


120484 


AA253170 


Hs56473 


EST 


81:60 




120983 


AA3982Dg 


HSJ7S87 


EST 




61.10 


121034 


AUB9951 


Hs.271623 


nudeoportn SOkD 




66.20 


121423 


AW973352 


Hs.290565 


ESTs 


64.40 




12^ 


AA451884 


Hs.190121 


ESTs 




60.40 


12S46 


A17ie702 


K5.306026 


ma]Or MstooompalSdity compiex, ctess 


168.60 




123130 


AA487200 




gb3b19i02.s1 Stratagene lung (937210) H 




80.20 


124472 


N5B17 


Hs.102670 


EST 


71.00 




124526 


f;62096 


Hs^185 


ESTs. Weaidy similar to JC732B amino ad 




104.90 


125489 


H49193 


Hs.124984 


ESTs. Moderately MlartoALJLJTJflJMAN A 




72.00 


125731 


R61771 


Ks.26912 


ESTs 




69J0 


125747 


N&L0Q2884 


Hs^ 


RAP1A, member of RAS oocooene fendlf 


69i» 




1^020 


H79B63 


H6.114243 


ESTs 




6140 


12647 


U47732 


Hs.84072 


trensmenfersM 4 8U|ffirtBniy nsmber 3 




62il0 


126966 


R38438 


Hs.ie2575 


sotute center (umi^ 15pf*^jpepGdeb8 




60.10 
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127472 


AA761378 


Ks.192013 


ESTs 


70i0 




127610 


AA960867 


Hs.15a271 


ESTs« KB9^ sli'i^ to unnamed prot^ 


64.00 




127742 


AW293496 


Hs.180138 


ESTs 


8&20 




127987 


AtQ22103 


Hs.124511 


ESTs 


96.60 




12B233 


AW889132 


Hs.11916 


rtboldnsse 




78.90 


128420 


AA650274 


Hs.41296 


SsonecSn Isucine rich trsnsmendvano p 




106.90 


128766 


AW1 60432 


Hs.296450 


ctdtooSaasi devetopment prot^ 1 


66.80 




129014 


AW935187 


H5.170162 


K1AA1357 proton 




58.53 


129215 


AB040930 


Hs.126085 


K1AA1497 prcM 


64.20 




130090 


H97878 


Hs.132390 


zinc finsar protein 36 (KOX 18} 


63J0 




130385 


AW067800 


Hs.155223 


stanniocaban 2 




139.60 


130732 


AW890487 


Hs.63984 


cadhain 13. H^adherin (hearQ 




64.60 


13102S 


AB040900 


Hs£1ffl 


IQAA1467protdri 


64.40 




131241 


BES01914 


Hs24^ 


Homo sapiens cDNA FU1 1 640 6s, dons HE 


7&20 




131775 


AB014548 


H831921 


K1AA0548 protein 


97.80 




132240 


A6ai8324 


Hs.42676 


KIAA0781 protein 




71.00 


132856 


NM-001448 


KS.5B3S7 


0)yp}can4 




88.40 


132977 


AA093322 


H&301404 


RNA btatfing motlF protein 3 


133L20 




133749 


L20852 


H&IOOIB 


sohitB canter fani^ 20 (phosphate (ran 




S9.30 


133818 


AIiiUdo4 


nS./MO 


fibfinoger^ B pdypep^ 


oAi An 




134254 


AF149297 


HS.80S7 


NA&5p^ 




64.30 


1342& 


M83772 


Hs^0S76 


Sandn containing monoos^enase 3 




23Z53 


134346 


X84002 


Hs.82037 


TATA box binding protein (TBP>'8ssociate 






134395 


AA456539 


KS.82S2 


lysosonnal-associated nftembrane protein 2 




75.80 


135047 


AL134197 


Hs.93597 


cyc&Hlependent kinase 5, Fegutatarysu 




108.30 


135056 


N75765 


Hs.93765 


lipoma HMGiC fusion partner 


71.40 




135309 


AI564123 


HS.42S00 


ADPwibosytalion fador-Iiiffi 5 


70.40 





TABLE 7B shows the accession numbers for those primekeys lacking unigeneIDs for Table 7A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
'Accession' column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT number Accessions
103207 3063S.-4 X72790
106566 120356_1 BE298210 AI672315 AW086469 BE298417 AA45S921 AA9Q2537 BE327124 R14963 AA085210 AV\OT4273 AI333S84 AQ^ AI885095 AI476470 AI2876S0 Al8K299 AS85381 AW592624 AW340136 AI266556 AA4S6390 AI310815 AA484951
116571 genbank_D45652 D45652
118466 genbank_N66741 N66741
101046 entrez_K01160 K01160
101941 entrez_S77583 S77583
103351 entrez_X89211 X89211
123130 genbank_AA487200 AA487200
Table 8A shows 1720 genes either up or down-regulated [...]

Chronically diseased lung samples represent [...] These genes were selected from 39494

probesets on the Eos/Affymetrix Hu02 Genechip array. Gene expression data for each probeset obtained from [...]
normalized value reflecting the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: 70th percentile of AI for lung tumors divided by 90th percentile [...]
R2: 70th percentile of AI for chronically diseased lung divided by [...]



Pkay 


ExAccn 


UnigendO 


UnigeneT^ 


R1 


R2 


300097 


AI916973 


Hs^13603 


ESTs 


&46 


4.69 


300117 


AW189787 


Hs. 147474 


ESTs 


dSB 


0l56 


300197 


AI886681 


Hs.218286 


ESTs 


428 


&44 


300201 


AQ08300 




QbiafflcOobXl NujCX3AP.Bm20 Homo saplen 


0.62 


(K83 


300225 


At989963 


H5.197505 


ESTs 


1.68 


1.75 


300247 


AWZ74682 


Hs.161394 


ESTs 


1.08 


228 


^0256 


AI469035 


Hs.298241 


Transmembrsne prateasOt ^Ais 3 


0l86 


1.00 


300337 


AI707681 


Hs^2090 


ESTs 


6i0 


9.09 


30Q3S2 


Z4230& 




^)JfSCQFB121 normdizad inteit brafai cON 


4.18 


1Z78 


300374 


AI659947 


Hs.314158 


ESTs 


2.99 


438 


300387 


AW270150 


Hs^54516 


ESTs 


1.50 


2.53 


300440 


A1421541 


Hs.146164 


ESTs 


3.98 


5.25 


300441 


R10367 


Hs,307921 


EST» Waa)dy Mar to Z23^Hll\&AN ZINC F 


3.18 


6.80 


300449 


AI^2957 


Hs. 132221 


hypotheSca! pMn FU12401 


0.43 


0l62 


300469 


AW135830 


Ks.233955 


hyiMiheScal protein FIJ20401 


0.16 


0.83 


300552 


XB5711 


Hs^1B38 


hypothsfical protein FU11 191 


4.10 


9l75 


300627 


W27363 




^3t>37dOl4l Strslsgene HeUoeD $3 93 


4.60 


1160 


300630 


AW1 18822 


Hs.128757 


ESTs 


2.91 


&86 


300716 


Am6113 


Hs.126280 


hypotheSca) protein FLJ23393 


in 


0l92 


300738 


AI623332 


Ks.1 30541 


KIAA1542prot^ 


1.62 


1.71 


300777 


AA235361 


Hs.96840 


KIAA1527 protein 


4.48 


&22 


300790 


AM92471 


Hs.188270 


ESTs 


1.29 


1.18 


300832 


AIKB147 


te.220615 


ESTs, Weakly simflar to T03829 transcrfp 


6.51 


6.58 


300836 


Z44942 


Hs^2958 


calcium channel alpha2-delta3 subonit 


4.90 


&34 


300838 


A15B2897 


Hs.192570 


hypoiheQcal protein RJ22028 


1.70 


2.81 


300878 


AW449B02 


Hs.285901 


Honio sapiens cONA FU20428&. done KA 


4,56 


7.91 


300897 


AI890356 


HS.127B04 


ESTSt Weakly sinAar b T1 7233 hypotheS 


to 


1.58 


300926 


AA504e60 




gb:ab03a1 0.81 Stratagene fetal refina 93 


2.13 


150 


300960 


At041019 


Hs.152454' 


ESTs 


2.74 


4.46 


300961 


AW204069 


Hs.312716 


ESTSy Weakly ^ndlar to unnamed protGbi 


1.00 


1.00 


300982 


AAS83373 


Hs.293744 


ESTs 


1.46 


1.51 


300987 


AA565209 


Ks^69439 


ESTs 


0.39 


1.30 


300987 


AW450840 


Hs.148590 


ESTs, Weakly sImOar to AF208846 1 BM-00 


1.49 


1.08 


300988 


AI927208 


Hs^08952 


ESTs 


0.16 


0.37 


301050 


AW136973 


H5.288516 


ESTs. Weakly similar to S69890 mitogen 1 


3.23 


1.94 


301038 


AA677570 


Hs.1 85918 


ESTs 


6.76 


14.28 


301157 


AA729905 


H5^31916 


ESTs 


3.16 


&85 


301162 


AI142118 


Hs.129004 


ESTs 


1.68 


7.18 


301170 


AA73^94 


Hs^47606 


ESTs 


4.40 


6.42 


301192 


AI808751 


Hs.121188 


ESTs 


6.38 


11.59 


301193 


AA758115 


Hs.1283^ 


ESTs, WeaMy sbnOar to JG5423 2-liydraKy 


4.35 


7.78 


301267 


AW297762 


Hs.255690 


ESTs 


\JSB 


1.61 


301281 


AA843986 


Hs.190586 


ESTs 


2.19 


1.78 


301341 


A1819198 


Ks.208229 


ESTs 


0.76 


0.76 


301382 


AA912B39 


Hs.163369 


ESTs 


\X0 


1.61 


301407 


AW45a466 


K5.126d30 


ESTs 


1.48 


1.51 


301452 


AAS75688 


Hs.159955


ESTs 


0.51 


1.46 


301483 


AWZr2467 


H1254655 


UnOBed 


2.40 


&Q2 


301494 


AI678034 


Hs.131099 


ESTs 


2.79 


3.41 


301521 


AI733621 


Hs.133011 


zinc finger protein 117 (HPF9)


a67 


0.67 


301531 


AI077462 
336659


0.60 


1.31 




336675 


0l31 


1.18 




336684 
336694 


1.50 


1.14 


4.74 


7.10 


336716 


4.43 


6.37 




336721 


Z20 


0.74 




336738 


1.64 


2.14 


>IA 


336900 


6l14 


12.73 


336948 


1.00 


4 AA 
1.00 




337028 


1.30 


2.09 




337043 


4.01 


1153 




337046 


1.67 


1.84 




337054 


278 


7.35 


337128 


7.20 


18.14 




337162 


3.45 


5.34 




337183 


5.72 


11,41 




337184 


3b72 


5.90 




337192 


177 


1.06 


337194 


1.88 


1.68 




337229 


Ol22 


1X)3 




337268 


1.00 


3.31 




337299 


3.23 


5.14 




337325 


2.76 


3.72 


337389 


5.80 


10.42 




337493 


106 


6.30 




337497 


7.88 


20.29 




337500 


3.80 


4.48 




337549 


t.66 


2.31 


337603 


1.27 


854 




337605 


5.76 


7.16 




337671 


0^73 


0.97 




337755 


1.54 


0.92 




337786 


5.07 


9.73 


337809 


6.18 


12S7 


337862 


3^78 


1Z97 




337871 


2.66 


8.16 




337958 


0.26 


1.34 


TA 



33S108 


1.48 


1.12 


338033 


2.38 


1459 




338083 


0l65 


Z16 




338110 


1.00 


1.61 




338112 


SJ8 


8.S 




336145 


1.70 


1.97 


338148 


8.07 


18.19 




338158 


1.30 


455 




338161 


2.58 


357 




338179 


1.00 


1 m 

I.W 




338182 
338189 


3^2 


463 


1.00 


334 


338197 


a99 


1.69 




338199 


4.58 


7.62 




338215 


6.01 


1555 




338279 


as3 


0J5 


338316 


20i8 


3&66 
338357
338359 
338386 
338374 
338414 
338418 
338469 
336501 
338506 
333523 
338549 
338561 
338662 
338671 
336676 
338726 
338779 
338804 
336838 
338871 
338872 
338879 
338937 
338966 
338993 
339047 
339100 
339114 
339121 
339170 
339229 
339284 
339293 



3^ 
4.10 

iai2 

0.69 
0.40 
0.47 
&12 
3.09 
&28 
6.97 

aio 

1.70 
0.79 
1.72 

ai7 
zio ■ 

1.20 

ai2 
a99 

1.00 
4.30 
5.02 
023 
6.55 
1.76 
1.00 
5.2S 
5.10 
1.00 
1.00 
10.38 
4.08 
Z64 
1.73 



7.39 

11.39 

2159 

1.02 

1.18 

1.06 

13.86 

5.11 

10^2 

12.41 

5.84 

Z70 

0.61 

1.46 

0.91 

15.88 

1.09 

0.57 

1.67 

1.00 

9.81 

12.61 

1.12 

12.26 

5.42 

Z40 

10.81 

&B8 

1.70 

a75 

19.67 

13.48 

3.63 

1,94 



TABLE 8B shows the accession numbers for those Pkeys in Table 8A lacking UnigeneIDs. For each probeset, we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey     CAT number   Accessions
322044   187363_1     AW340926 AA249063 N86075
322060   44320_1      AI341937 AW003063 U34725 AA904742
321430   42705_1      X57414 X57415
321467   43034_1      X13075 X13076
322125   45779_1      R93901 AF0750n R93902
322166   46861_1      H69434 AF085956 H69846
322173   46873_1      H5^7 H52557 AF085970 H52164
322178   46882_1      H56535 AF085980 H55712
322179   46885_1      H92891 AF085982 K92777
321577   1615102_1    H84849 H64252 H84260 H86664 HK320
321567   1615333_1    H95531 H95521 H84529
313723   111953_1     AA070412 AA102346 AA081B85
320997   627492_1     H22544 H46842 AI204929
322278   47271_1      W69304 AF086283 W69200
321687   218439_1     AA625149 AA313030 AA313Q52 K97463
313883   129439_1     AA665089 AA135130 AA484(£9 AA102419 AW877765
322320   4742^1       W79150 AF086419
         814584_1     AI868646 AI734214 W17348
314648   293660_1     AW97928S AA878419 AA431342 AA431^
300201   682222_1     AI308300 AI306296
308897   25196.^      AI093967
323155   979809_1     AL120701 AL135041 AL121524
32;S27   38927_1      AF147359 T58511 T58560
322585   473768_2     W88919 W69125
300362   1574395_1    242308 H23514
322835   82298_1      AA005129 AA679084 AA694399
322664   8504^.1      AA011522 AA702841 AA011691 AA330797
315454   380580_1     AI239454 AI239473 AA625812 AI208703
32^      37372_1      AF074666 AI110759 AF090902
314852   327472_1     AI903735 AM91263 AI694953 AW976803 AA761382
307783   ©7809.1      AI347274 AW844024
324072   ffi9032.1    AA381722 AA381829 AW963908 AW963902 AA381242
300627   221345_1     AA488472 W27363 AA317053 8E0826B9 AW967035 BE079872
323505   196389_1     AW970512 AA280251 AI652287 BE456438 AI650725 AA551854 AA281574 AW5714B1
315791   403558_1     AA678177 AA677034
324303   233842_1     AL118754 AA333202 H38001
316519   442865_1     AA847835 AA76B378
300926   333127_1     AA504860 AA504911
324580
301B82 
324804 
324889 
302697 
30Z711 
302742 
318499 
310624 
302847 
304122 
3035S8 
31140 
312034 
319312 
319407 
31942S 
320007 
320018 
319484 
318865 
312220 
319546 
312389 
319611 
312437 



311696 
319834 
321102 
321158 
321199 

305526 
321270 
314126 
320714 
306442 
306446 
306458 
306510 
305557 
30S572 
308582 
306656 
306666 
306751 
308011 
306892 
308106 
308154 
306956 
306958 
308213 
308216 
308219 
308588 
308599 
308643 
308673 
3Q8g7 
308778 
30^08 
30K75 



308966 
308979 
303011 
303077 
305016 
305034 
305072 
305148 
305190 
303978 
3039^ 
303998 
303999 
3D5235 
305312 
305413 
305447 
321244 



275087J 
398093.1 
1515978J 
43219.1 
45419J 



364430J 

34824 4 

458.105 

77271_-5 

270283.1 

837254.1 

797889.1 

1540116.1 

1688823.1 

1689571J 

2296B3L1 

1815987.1 

1691553J 

1535937.1 

1671607J 

24330SJ 

902067.1 

1566863.1 

291472.1 



579192^1 

112523J 

80531J 

410938.1 

212379.1 

2883^-3 
1562057.1 
177666.1 
743644^1 
AA976899 
AA977348 
AA978t86 
AA988546 
AA994530 
AA995686 
AA998248 
AI004024 
At015515 
A]032589 
AI439473 
AI092465 
AI476803 
A)500600 
A1125111 
AI1251S2 
AIS7041 
AI557135 
AI557246 
Ar7t8299 
A1719893 
Ar745040 
Ar760864 
AI767143 
A1811109 
AI818289 
A1832332 
AI833240 
AIB56845 
AI870704 
AI873111 
41689.1 
44 060 J 
AA626676 
AA630128 
AA641012 
.M654070 
AA565955 
AV\G13315 
AW15465 
AW16449 
AW516611 
AA570480 
AA70Q201 
AA724659 
AA737856 
29327.1 



T780S4 i /9B88 AA398185 
AIQS52AI393343AI800510AI377711 F24263AA661B76 
O31010D3(^ D31168D31166D31465 
AJ001409Aj001410 
UI8442D51348 
L12061 

T25451 AA5852g5AA585305 

U88898 UB8898 AA91 6056 T03285 At341 594 AI3S9S34 AI634031 U88897 

X^l X38942 X98943 X98953 Xg8949 

H26966 

AA^14 AA402411 AA412355 
AlS8839Aig09260AI909259 
278390 197^ 
Z45481F12393T74437 
R05329 R01555 R08276 
T82930R02424TB5145 

AA336314T82938AA327744AW967388AA639967T10753 

T83263TB5731 T85730 

T91772R07257 R07098 

H10818F07831Z43072 

N74613T98756 T98589 

R09692R09414AA348353 

AI883140 W80703 R43474 

H14957 R56522R11908 

BE080180 AWa27313 AW231970 AA995028 AA428584 AW872716 AW8925ra AW854593 AA578441 AW975234 AA664937 AA984131 

AA52S743AA5S2874AA564758AVV063245AI267S34AW070190AVV893483AA770330AA906928AA90&562M 

AW063311AA429538 

AW206447 A1248530 AI084433 AI400976 R16553 
AA071267 TK940 T64515 AA071334 
AA018306K38925AA001221 
H79670 H47798AA700289 

N34524AA305071 AVVg54803AA502335A)433430At203597AW026670AW265323AW850787AA31^AW99364^ 

AW385512At334966 W32951 H52656 H53902R88904AW835732 

AA769156 

N59537 N7a278RB3550 
AA226431 AA226569 AA488748 
R91B83AI445591 



AF090405 AF090407 AF090406 
AF163305AF163307AF163303 



APQ68654AF0686S6AFD68655 






WO 02/086443





AA7828B6 




AA806124 






30^50 


AARQ77Q9 


3^90 


AA813477 


3(^28 


AAfi2a209 


305759 


AAfi^53 


305792 


AA84S256 


307041 


AI144243 


307091 


Ai167439 


307181 


A1189251 


305^1 


AA872968 


305910 


A/^5981 


307415 


A1242118 


307426 


AI243364 


30^17 


AI275055 


30^51 


AI2B1556 


307561 


A!282207 




A1S0295 


307691 




307730 


Alt3fi09? 


307760 


AI342387 


307754 




307796 


A^50556 


309045 




309051 


A1911975 


307807 


A1351799 


307608 


AI351826 


307820 


Af355761 


307B52 


AI385541 


309122 


AQ2fi178 


309164 


AI937761 




AI951118 


307902 


AI380462 


309299 


AWD03478 


309303 


AWm4823 


309476 




309532 


AW1S1119 


309747 




309769 


AW27234fi 


32^99 




3098^ 


AW299916 


302679 


311853 1 HfiS022AA1 86889 


309923 


AVV340684 


309928 


AW34141R 


3(^31 




309933 


AW34193fi 


2)2705 




302789 


\&1Bi 1 AI745n(r7 AJ74'>n70 


3040(K 


AUR17947 


304024 




304026 


TO3160 


304028 


T03266 


304046 




304061 


T61S21 


304063 




302802 




304114 




304155 


H66696 


304203 




304234 


VV81608 


3)4348 




304430 


rvWrf OOA 


304456 




304521 


AA464716 


304526 


AA476427 


304607 




304735 




304760 




30^15 




306083 


AA906316 


308065 


AA90fi725 


308104 


AA910956 


308109 


AA911661 


306242 


AA932805 


306288 


AA936900 


306396 


AA970223 


330568 


NOT_FOUND_entrez U56244


330599 


15323_-12 U90437 


331131 


genbank_R54797 R54797


331203 


NOT_FOUND_entrez T82310


331531 


genbank_N51343 N51343


331547 


46739GL1 AAB28597 H54811 


332074 


g8nbarilLMS99012 AASS9012 
TABLE 8C shows the genomic positioning for those Pkeys in Table 8A lacking Unigene IDs and accession numbers. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey     Ref                Strand  Nt_position
332792   Dunham, I. et al.  Plus    73381-73768
332816   Dunham, I. et al.  Plus    35w44-360iu0
332906   Dunham, I. et al.  Plus    1923101-1923205
332911   Dunham, I. et al.  Plus    1961 767-1 96)858
332912   Dunham, I. et al.  Plus    19621 2D>1962246
332922   Dunham, I. et al.  Plus    200962O-20ra738
332956   Dunham, I. et al.  Plus    S1052frS1065B
332959   Dunham, I. et al.  Plus    251B145-S1B213
3331^    Dunham, I. et al.  Plus    3369205-3369323
333139   Dunham, I. et al.  Plus    3369495-3389571
333221   Dunham, I. et al.  Plus    3978070-3976167
333380   Dunham, I. et al.  Plus    4904775-4904845
333387   Dunham, I. et al.  Plus    4910935-4910997
333512   Dunham, I. et al.  Plus    556051 0-S60564
333524   Dunham, I. et al.  Plus    5612620-5612780
333585   Dunham, I. et al.  Plus    6234778-6234694
333818   Dunham, I. et al.  Plus    6562391-6562586
333627   Dunham, I. et al.  Plus    6620584-6620903
333628   Dunham, I. et al.  Plus    6629004-6629233
333650   Dunham, I. et al.  Plus    6796852-6797128
333878   Dunham, I. et al.  Plus    7068223-70682^
333750   Dunham, I. et al.  Plus    7608165-7608234
333763   Dunham, I. et al.  Plus    7692491-7692630
333767   Dunham, I. et al.  Plus    7694407-7694623
333768   Dunham, I. et al.  Plus    7695440-7695697
333769   Dunham, I. et al.  Plus    7696825-7696707
333772   Dunham, I. et al.  Plus    7706773-7706902
333777   Dunham, I. et al.  Plus    7746805-7746916
333846   Dunham, I. et al.  Plus    80066234008757
333884   Dunham, I. et al.  Plus    81539604154161
333887   Dunham, I. et al.  Plus    8154882-8155025
333891   Dunham, I. et al.  Plus    81564374156709
333892   Dunham, I. et al.  Plus    8156B2S4157001
333948   Dunham, I. et al.  Plus    8583497-6583627
333954   Dunham, I. et al.  Plus    6563166-6563335
333966   Dunham, I. et al.  Plus    8655643-8655826
333968   Dunham, I. et al.  Plus    8661004-8681241
334061   Dunham, I. et al.  Plus    9686941-9687077
334094   Dunham, I. et al.  Plus    9889953-9890105
334113   Dunham, I. et al.  Plus    1Q282459-1028S97
334161   Dunham, I. et al.  Plus    10599033-10599180
334219   Dunham, I. et al.  Plus    12716160-12716384
334239   Dunham, I. et al.  Plus    13056569-13056693
334333   Dunham, I. et al.  Plus    13603544-13603657
334378   Dunham, I. et al.  Plus    13907239-13907370
334382   Dunham, I. et al.  Plus    13915866-13916036
334562   Dunham, I. et al.  Plus    14987847-14987940
334568   Dunham, I. et al.  Plus    15032740-15032817
334616   Dunham, I. et al.  Plus    15176123-15176470
334633   Dunham, I. et al.  Plus    15333206-15333305
334866   Dunham, I. et al.  Plus    18872214-18872317
334691   Dunham, I. et al.  Plus    19299770-19^9944
334934   Dunham, I. et al.  Plus    20103970-20104058
335015   Dunham, I. et al.  Plus    20682792-20682946
335120   Dunham, I. et al.  Plus    2143628&214%384
         Dunham, I. et al.  Plus    21441390-2144i471
335179   Dunham, I. et al.  Plus    21634405-21634526
335188   Dunham, I. et al.  Plus    21689118-21^9328
335211   Dunham, I. et al.  Plus    21774611-21774680
335361   Dunham, I. et al.  Plus    22807292-22807445
335379   Dunham, I. et al.  Plus    22899306-22899420
335414   Dunham, I. et al.  Plus
335416   Dunham, I. et al.  Plus    23237354-23237465
335496   Dunham, I. et al.  Plus    24164386-24164545
335497   Dunham, I. et al.  Plus    2416766fr-24l67869
335558   Dunham, I. et al.  Plus    24740167-24740347
335586   Dunham, I. et al.  Plus    2499033^4990497
335686   Dunham, I. et al.  Plus    2439839-25439920
335784   Dunham, I. et al.  Plus    25942710.S942792
335823   Dunham, I. et al.  Plus    2636592&-26366(m
335983   Dunham, I. et al.  Plus    27938968-27939070
335995   Dunham, I. et al.  Plus    28009044-28009184
336021   Dunham, I. et al.  Plus    2868648^686559
336034   Dunham, I. et al.  Plus    29014404-29014590
         Dunham, I. et al.  Plus    29Q2KfiS29tC31 fiS
336107   Dunham, I. et al.  Plus    299B7731->29987K9
336632   Dunham, I. et al.  Plus    983890>S85529
336633   Dunham, I. et al.  Plus    98a91>98&221
335S34   Dunham, I. et al.  Plus    986296-986670
336635   Dunham, I. et al.  Plus    987908-988354
336636   Dunham, I. et al.  Plus    988418-3891K
336637   Dunham, I. et al.  Plus    98927&-990813
336638   Dunham, I. et al.  Plus    ^1906^40
336669   Dunham, I. et al.  Plus    1898402-1895476
336694   Dunham, I. et al.  Plus    2420546-2420616
336721   Dunham, I. et al.  Plus    3371522*3371586
336900   Dunham, I. et al.  Plus    10236423-10236523
336948   Dunham, I. et al.  Plus    12692290-12692381
337028   Dunham, I. et al.  Plus    16644817-16644942
337054   Dunham, I. et al.  Plus    17821742-17821922
337162   Dunham, I. et al.  Plus    23478343-23479145
337183   Dunham, I. et al.  Plus    23943605-239436S6
337184   Dunham, I. et al.  Plus    23973949-23974016
337268   Dunham, I. et al.  Plus    28011979-28012034
337299   Dunham, I. et al.  Plus    29022656-29022775
337389   Dunham, I. et al.          3140150^1401579
337493   Dunham, I. et al.  Plus    33330760-33330381
337549   Dunham, I. et al.  Plus    34474472-34474531
337755   Dunham, I. et al.  Plus    3971764-3971900
337809   Dunham, I. et al.  Plus    4449069-4449193
337871   Dunham, I. et al.  Plus    5443027-5443101
337958   Dunham, I. et al.  Plus    6969162-6969270
338008   Dunham, I. et al.
matrix metalloproteinase 10 (stromelysin


13Z45 


4.00 


400298 


AA032279 


Hs.61635 


six transmembrane epithelial antigen of


43.88 


74.00 


400301 


X03635 


Hs.1657


estrogen receptor 1 


1.00 


1.00 


400303 


AA242758


Hs.79136 


LIV-1 protein, estrogen regulated


1.75 


1.65 


400328


X87344


Hs.180062


transporter 2, ATP-binding cassette, sub


Ol87 


1.80 


400419 


AF084545




Target 


156l55 


^00 


400512






NM_030978*:Homo sapiens cytochrome P450,


1.00 


2.00 


400517 


AF2423B8 




lengsin 


3.67 


87.00 


400560 






NM_030878*:Homo sapiens cytochrome P450,




1.00 


400664 






NM_002425:Homo sapiens matrix metallopro


20.26 


45.00 


400665 






NM_002425:Homo sapiens matrix metallopro


1.36 


1J)7 


400656 






NM_002425:Homo sapiens matrix metallopro


3.26 


122 


400749 






NM_003105*:Homo sapiens sortilin-related


1.00 


91.00 


400763 






Target Exon 


7.63 


24.00 


401027 






Target Exon 


1.00 


1.W 


401093 






C12000588*:gi|6330167|dbj|BAA86477.1| (A


1.00 


155.00 


401203 






Target Exon 


1.00 


88.00 


401212 






C12000457*:gi|7512178|pir||T30337 polypr


1.00 


400.00 


401411 






ENSP00000247172*:HYPOTHETICAL 126.2 kDa


1.00 


72.00 


401435 






C14000397*:gi|7499898|pir||T33295 hypoth


1.00 


64.00 


401464 


AF039241 




histone deacetylase 5


3.82 


49.00 


401714 






ENSP00000241802*:CDNA FLJ11007 FIS, CLON


2.02 


40.00 


401747 






Homo sapiens keratin 17 (KRT17)


12a43 


68.00 


401760 






Target Exon 


1.74 


35.00 


401780 






NM_005557*:Homo sapiens keratin 16 (foca


26.47 


10.50 


401781 






Target Exon 


10^ 


4.61 


401785 






NM_002275*:Homo sapiens keratin 15 (KRT1






401797 






Target Exon 


1.44 


Z10 


401961 






NM_021626:Homo sapiens serine carboxypep


1.41 


1.88 


401985 


AF053004




class I cytokine receptor


1.00 


177.00 


401994 






Target Exon 


61.84 


47.00 


402075 






ENSP00000251056*:Plasma membrane calcium


1.00 


1.00 


402260 






NM_001436*:Homo sapiens fibrillarin (FBL


1.58 


1.39 


402265 






Target Exon 


2.09 


35.00 


402297 






Target Exon 


1.00 


mo 


402408 






NM_030920*:Homo sapiens hypothetical pro


28.87 


13.00 


402420 






C100O823ngl|10432400jemb|CAC1029ail (A 


1.00 


1.44 


402674 






Target Exon 


7.44 


243.00 


402802 






NM_001397:Homo sapiens endothelin conver


1.00 


70.00 


402994 






NM_002463*:Homo sapiens myxovirus (influ


1.37 


1.43 


403137 






NM_005381*:Homo sapiens nucleolin (NCL),


1.00 


19.00 


403306 


NM_006625




transmembrane protein (63kD), endoplasmi


1.00 


43.00 


4033S 






Target Exon 


1.00 


61.00 


403381 






ENSP00000231844*:Ecotropic virus integra


1.00 


iiaoo 


403478 






NM_022342:Homo sapiens kinesin protein 9


ai3 


isaoo 


403485 






C3001813*:git12737279trBfpCP_012163.1| ic 


20.23 


76.00 


403627 






Target Exon


6.30 


2933 


403715 






Target Exon


1.30 


35.00 


404044 






ENSP00000237855-£J398G3^ (NOVEL PROTEI 


1.00 


54.00 


404076 






NM_016020*:Homo sapiens CGI-75 protein (


14.29 


91.00 


404101 






C8000950:glK235a}|piri|A47318 i^tindi 


1.00 


1.00 


404140 






NM_006510:Homo sapiens ret finger protei


1.42 


1.44 


404165 






ENSP00000244562:NRH dehydrogenase (quino


1.00 


54.00 


404185 






Target Exon 


1.00 


117.00 


404210 






M^ODSggjOTO sapiens myeloMIWMM 


5.93 


ia77 


404253 






NMjQ2109*Mimo ssftens Hffl Idsbne fiarni 


1.00 


1.00 
4042S7






C6u01909;SiI7u4441 (atpAAl 8S09.1) (D298 


29.71 


^00 


404^ 






C6001 238*:^ 1 21 71 5)spIP26697iGTA3_CHICK 


1^ 


1.00 


404347 






Target Exon 


1.00 


1.00 


404440 






NM.azi04&KQ(no sapfens meiatoma an^en, 


i.Uu 


15.00 


404^ 






NM_005598*:Homo sapiens nuclear factor I


4 tm 


60.00 


404794 


NM_000078




cholesteryl ester transfer protein, plas


1.07 


138 


404654 






Target Exon


I.D1 


201 


404877 






NM_005365:Homo sapiens melanoma antigen,


4 itn 

I.W 


IjOO 


404927 






Target Exon


1.(X) 


1.00 


404936 






Target Exon 


4 fin 
t.W 


1.00 


405449 






ovAftnn>r7v<Jt4 4ii4'?ooiiimRvo /M&ma4i« 
CYuK;U47^^11 14^ridMgeip(r JHB3^1 1 Z 


I.UU 


4 M 


40S6B 






NM_031413*:Homo sapiens cat eye syndrome


4 nn 
1.00 


78.00 


405572 






Target Exon 


(LrO 


1.14 


405646 






C12000200:gi|4557225|ref|NP_000005.1| a


1.01 


1.28 


405676 


BE336714 




cytodvome^l 


1.13 


2.89 


405770 






NM_002362:Homo sapiens melanoma antigen,


45.52 


37.00 


405932 






C1500030&g])38061^b]AAC6919&1| (AFD 


1.99 


1.99 


406137 






NM_000179*:Homo sapiens mutS (E. coli) h


2.77 


2.38 


406380 






Target Exon 


4 MX 

I.UU 


35.00 


406399 






NM_003122*:Homo sapiens serine protease


1.00 


39.00 


40B467 






Target Exon


1.00 


1.00 


405621 


nam 
X57809 


nS.1811a 


immunoglobulin lambda locus


4 44 
l.4l 


1.74 


40^42 


AJ249Z10 




gb:Homo sapiens mRNA for immunoglobulin


4 4fi 


3l91 


406683 


UZ4oBj 


Hs.293441


immunoglobulin heavy constant mu


2.Uf 


ZS3 


406671 


AA12^7 


Hs.285754


met proto-oncogene (hepatocyte growth fa


15.00 


51.00 


40^73 


M34996 


HS.1S8253 


major histocompatibility complex, class


U.90 


3.09 


408676 




ns.B1221 


Human L2-9 transcript of unrearranged im


. i.oU 


4 C4 

1.59 


406676 


U77534 




gb:Human clone 1A11 immunoglobulin varia


4 44 

1.33 


1.45 


406685 


M18728 




gb:Human nonspecific crossreacting antig


1.46 


2.85 


406687 


M31126 


Hs^72S22 


pregnancy specific beta-1-glycoprotein 9


D £4 
S.D1 


6.50 


406690 


M29540 


Hs.220529 


carcinoembryonic antigen-related cell ad


226,37 


^00 


406698 


X03068 


Hs.73931 


major histocompatibility complex, class


1.01 


Z52 


4(^15 


AAB33S30 


Hs.288036 


tRNA isopentenylpyrophosphate transferas


20.25 


32.00 


406851 


AA609784 




major histocompatibility complex, class


0.75 


1.91 


408964 


M21305 




gb:Human alpha satellite and satellite 3


38.15 


1114i)0 


40367 


M24349 




gb:Human parathyroid hormone-like protei


1.00 


1.00 


40ra74 


M57293 




gb:Human parathyroid hormone-related pep


1.00 


1.00 


407103 


AA424a81 


Hs.^6301 


hypothetical protein MGC13170


1.77 


1.10 


407128 


R&3312 


Hs.237260 


EST 


1.00 


1.00 


407137 


T97307 




gb:ye53h05.s1 Soares fetal liver spleen


142.70 


135A) 


407168 


R451^ 


Hs.117183


ESTs 


2.16 


18.00 


407239 


AAO76350 


Hs.67846 


leukocyte immunoglobulin-like receptor,


1.10 


1.57 


407242 


M18728 




gb:Human nonspecific crossreacting antig


1.12 


2.65 


407244 


M10014 


Hs.75431 


fibrinogen, gamma polypeptide


3.24 


15.38 


407289 


M135159 


Hsi03349 


Homo sapiens cDNA FLJ12149 fis, clone MA


3.53 


3.68 


407300 


AA102616 


Hs.120769 


gb2n430O73l Stratagene HeLa cell s3 93


19.74 


73.00 


407366 


AF026942 


Hs.271530 


gb:Homo sapiens dg33 mRNA, partial sequ


0.(S 


8.25 


407378 


AA299264 


Hs.57776 


ESTs, Moderately similar to 136022 hypot


1.00 


26.00 


407430 


AF169351 




gb:Homo sapiens protein tyrosine phospha 


1.00 


25.00 


407453 


AJ132087 




gb:Homo sapiens mRNA for axonemal dynein


1.00 


75.00 


407577 


AW131324 


Hs.246759 


hypothetical protein MGC12538


1.00 


1.00 


407634 


AVVu1w69 


> 1_ J n^J4 J 

Hs.136414 


UDP-GlcNAc:betaGal beta-1,3-N-acetylgluc


111.20 


228.00 


407710 


A t Aifto T 

PCmBZfTl 


Hs.2%16 


ESTs 


1.00 


28.00 


407720 


AB037776 


Hs.38002 


AAA L^i— 

KIAA1355 protein


1.89 


1.31 


407746 


AK001962




hypothetical protein FLJ11100


1.00 


1.00 


407756 


AA1 16021 


Hs.38260


ubiquitin specific protease 18


4.51 


&00 


407758 


D50915


Hs.38365


KIAA0125 gene product


1.00 


28l00 


407782 


AA608956 


Hs.112619 


ESTs, Moderately similar to PURKINJE CEL 


0.97 


1.14 


407788 


BE514982 


Hs.38991 


S100 calcium-binding protein A2


7.88 


3.83 


407790 


Ai027274 


ns.Zoo941 


Homo sapiens ciJNA FU14ooo ns, clone PL 


3.63 


4Z00 


407811 


AW190902 


Hs.40098 


cysteine knot superfamily 1, BMP antagon


89.96 


109.00 


407839 


AA045144 


Hs.161566 


ESTs 


17191 


108.x 


407944 


R3400B 


Hs.239727 


desmocollin 2


111.30 


70.00 


408000 


L11690


Hs.620 


bullous pemphigoid antigen 1 (230/240kD)


151.17 


6.00 


408031 


AA081395 


Hs.42173 


Homo sapiens cDNA FLJ10366 fis, clone NT


9.91 


33.00 


408063 


BE086548 


Hs.42346 


calcineurin-binding protein calsarcin-1


195.78 


231.00 


408070


AW1488S2 




{JbadOSdOSjcl NQ.CGAP_Bm35 Homo sapien 


1.00 


1.00 


408101 


AvV96flo04 


Hs.123073


CDC2-related protein kinase 7


37.64 


61.00 


408122 


A1432652 


Hs.42824 


hypothetical protein FLJ10718


0.85 


1.71 


408212 


AA2975o7 


Hs.43728 


hypothetical protein


Si88 


7J1 


408243 


Y00787 


Hs.624 


interleukin 8


427 


9.98 


408349 


BS46947 


Hs.44276


homeo box C10


3.79 


146 


408353 


BE439B38 


Hs.44298 


mitochondrial ribosomal protein S17


1.88 


1.65 


408354 


AI382803 


Hs.159235 


ESTs 


1.00 


73.00 


408369 


R38438 


Hs.182575 


solute caiiier buiiBy 15 (H777 transport 


1.41 


16.50 


408380 


AF1Z3ioO 


Hs.44532


diubiquitin


15.19 


37.22 


408482 


NM_w0o7d 


Hs.45743 


adenosine A2b receptor 


1.K 


1.19 




Ai541214 




Crmfl rwr^na-m4 ntnlom CDt9lf Chum^n 

oiudu {Mwiwiu<4i )N\Hoin or iMa inuJTion, 


1 OA 


1 9A 


408536 


AW381532 


K8.135t88 


ESTs 


1.55 


1.50 


408545 


AW235405 


HS.S3690 


ESTs 


1.00 


1.00 


408572 


AA05»11 


Hs22K68 


ESTs, J^toder3t8iysindartDALU4.HUiidANA 


1.00 


4400 


408633 


AW963372 


Hs.45677 


PRO20(R) protein 


107.16 


56.00 


408660 


AA525775 




ESTs. Moderately slnflar to PC429 feni 


1.00 


m 


408761 


AA057264 


H&238936 


ESTs, Weakly similar to (defline not ava


S224 


141.00 


408771 


AWr3S73 


Hs.47584 


potassium voltage-gated channel, delayed


ao5 


109.00 
408783


AF192522 




MDf^ flit iiii ,1 — ■ JSn n n taMtyik /M MM 


4 fW 

1.QZ 


4 AT 


409790 


AW5K)227 




nsuTDtnfdiic tjffoans Mnase, receptor, 


44 4Q 


e4 m 


408805 


KS9912 


r1S.4CA9 


vaccinia related kinase 1




4g.UU 


40^1 


AW43ooo5 




CO 18 


4 nn 
1.W 




406873 


AL04Qj17 




c^bntxfalSR 2 (phos^^torylase Idnase, 


4 fM 

ijn 


Qo nn 
09.UU 


408908 


BE2962Z7 




serine/threonine kinase 15


7.76 


4 nn 
1.W 


408992 


AA05S3a 


Lh. 74CilO 


guanine nucleotide binding protein (G pr


4 rift 
1.0Q 


t nn 


408^6 


Ai979lbo 




glycoprotein (transmembrane) nmb


an 


%9U 


409015 


Bc3o93o7 


nS.49/Q/ 


nnLvxm5£non)o Sapiens nauh oenyurogens 


4 AA 

1.44 


4 9A 
1.24 




T97490 




sntal induribte qftotane sultfaniBy A (Qf 


A 9& 
4uSo 




409041 






nypoineiicai pnHsn^ Arjra loou \Kvv\i i v 


11«.42 


4DR nn 

193.UU 


40OT77 






cols 


4 m 


47 nn 

If AM 




Du400J4 


Ue CtV/l>H 


V-AjHW pdMsSI 




1 en 


WiTlUo 




ns.i imw 


AAun- 1 pruiBai 






4^^42 


ALI0OO// 


cm CD 


oaiiJ4 (Suuciuiai mauusnanoe or cnromoso 


14*7 


fi/m 


409187 


Arl04oJU 


tie cAOce 


carbamoyl-phosphate synthetase 1, mitoch


1 nn 


1 m 

l.UU 


409228 


AlCCil 

Alm42So 




cois, vveaxiy Suniiar B> zioscoUA p ceu 


1 99 


1 nn 
liUv 


409234 


AiB784i9 


H5JZ7206 


cSTS 


4 nn 
1.0Q 


4 nn 


409268 


AAd25304 


Hs.187579 


COT8 


44 on 
11.80 


99 nn 


409269 


AA57o953 


n5.Z2972 


hypothetical protein FLJ13352


4 nn 


4 nn 
1.00 


409381 


NM_005982


H5.544I6 


sine oculis homeobox (Drosophila) homolo


168.91 




409404 


DE220059 




ESTs 


4 /M 


4 nn 
1.W 


409420 


zison 


nS.544al 


laminin, gamma 2 (nicein (100kD),


79.74 


96X10 


409430 


R21945 


HS. 340/35 


splicing factor, arginine/serine-rich 5


1.45 


9 4n 
2.10 


409446 


AI561173 


nS.br boo 


col8 


4 M\ 


A nn 


409506 




nS.d4909 


n\j\ aoaptor proran i 




ZJ6.W 


409522 


A AfVTC^OO 




^)a9n87b03.8l Strat^ene ovarian cancer 


4C Ofi 
19.90 


141.UU 


409582 


AAAMItXl 

AMul ooy 


Ue 4Qn701 


CO 18 


4 nn 


17 nn 


409632 


W/4001 


Lie ec77a 


serine (or cysteine) proteinase inhibito


909 49 


79iK) 


409705 


M37762 


Hs.%023 


brain-derived neurotrophic factor


4 nn 
1.00 


Q9 nn 

02.UU 


409719 


AI769150 


HS.IO808I 


Homo sapiens brain tumor associated prot


1.00 


4 nn 


409731 


AA1 25985 


Hs.56145 


thymosin, beta, identified in neuroblast


0.12 


1Q 49 

10.12 


409744 


AWo75Z5o 


nS.boicoo 


Homo sapiens mRNA; cDNA DKFZp586P2321 (f


on 7C 


CI nn 


409757 


NM_001898


Hs.123114


1 — r*_ n^l 

cystatin SN


22.46 


I9.M; 


409866 


AVraiSIoZ 




j^U4 n LfC DDfks *Jtr 4 44 A 1 11 »4 MIU tttit* C 


4 nn 
1.00 


4 nn 
l.UU 


409893 


AW247u90 


Hs^lOI 


■ ■■Infill ■! ■■ III ■■■llT.il 1 ■ Mil J, iPmIiibiI #0 

nuucivunusonie rnainiBnance oeixaeni \o. 


1.S) 


1.09 


409902 


AI337658 


Hs. 156351 


ESTs 


25.92 


90.00 


4{^35 


AVv511413 


HS.Z76U25 


ESTs 


9 M 


9 44 
2.11 


409956 


Awl 03364 


Hs.727 


inhibin, beta A (activin A, activin AB a


Z17 


A n4 
4.01 


409958 


NNLOOIaZS 


Hs.57697


hyaluronan synthase 1


nD4 


9 AT 
2.07 


410001 


ADAA4fMe 

ABWiOSo 


Un C7W4 


kallikrein 11


4 Ail 


9 90 


410032 


Bc0o5985 




QiKKuS-aTuSl 9-120200-014-309 blUJia HOmO 


4 nn 


CD nn 
90AW 


410037 


AB020726 


ns.do0u9 


t^AAnQ40 Mmlnln 


4 nn 
1.W 


91 nn 


410044 


BcSD0742 


HS.aolD9 


highly expressed in cancer, rich in leuc


4 nn 

1.W 


4 n\ 
1.00 


410048 


W76467 


Hs.58218 


proline oxidase homolog


4 no 


4 AA 

1.44 


410076 


T05387 


Hs.7991 


ESTs 


4 49 
1.1< 


4 cn 
1.90 


410102 


AW24o508 


Hs.27g7Z7 


Homo sapiens cDNA FLJ14035 fis, clone HE


9.69 


4 nn 
1.00 


410153 


dE31192o 


Hs.15d30 


nypoiiietical protein rU12ml 


4 nn 

LOU 


4 nn 

1.UJ 


410166 


AK001376 


Hs.59348 


hypothetical protein FLJ10514


1.00 


4 nn 
1.00 


410193 


AJ132592 


Hs.59757 


zinc finger protein 281 


ii9n4 


C4 nn 
51 AW 


410274 


AA381807 


Ll^ A4Ttf^ 

Hs.61762


hypoxia-inducible protein 2


4 94 
1.7Z 


4 99 


410309 


Bc043077 


nS.27o153 


ESTs 


4 nn 


9 nn 
2.00 


410340 


AW182833 


Hs.1 12188 


t-^ ■ - i— IA^4J#I 

hypothetical protein FLJ13149


32.08 


75.00 


410348 


AW182663 


Hs.95469 


ESTs 


1.00 


1.00 


410407 


X66839 


Hs^287 


carbonic anhydrase IX


1.40 


1.11 


410416 


031362 


Hs.63325 


transmembrane protease, serine 4


4.0U 


9 AQ 

2.09 


410438 


A5u377ot 


Hs.45207 


ttimothnllnnl .i.ntotn MAA49^C 

nypovietica] protein isiAAi<3«» 


4 nn 


40 nn 
loOO 


410553 


AW016824 


ns.255527 


hyjKjUieScal protein m(9C1412o 


1.34 


1.04 


410555 


W27235 


Hs.64311 


a disintegrin and metalloproteinase doma


2359 


4 AA 

1.41 


410561 


De54Q2po 


Hs.6994 


rionio sapteis cuna. r\j£am v&f done n 


4n tXA 


4 nn 


410681 


AVV24d890 


Hs.65425 


calbindin 1, (28kD)


4n flQ 


40 OO 

10.92 


410781 


AI37^72 


Hs.1 65028 


ESTs 


1.00 


57 JX) 


411027 


AR)72099 


Hs.67846 


leukocyte immunoglobulin-like receptor,


4 M 
l.K 


J. 70 


411074 


X60435 


1 1— ^AA«W 

Hs.68137 


adenylate cyclase activating polypeptide


1.00 


1.15 


411089 


AA456454 




cell division cycle 2-like 1 (PITSLRE pr


1.K 


1.56 


411152 


BE0691K 




gDK3v3-BT0379H;1 0300-105^3 oT0379 Homo 


1.00 


OA nn 
B4.00 


411248 


AA551538 


HsJ34605 


Homo sqpiens cDNA rU144uB lis, oone HE 


1.82 


4 AC 

1.45 


411252 


AB018549 


Hs.fi9328 


M0-2pnrt^ 


7.3Z 


49 TJ 
12.74 


411263 


6E297802 


Hs.69380 


kinesin-like 6 (mitotic centromere-assoc


9 AA 


2.99 


411355 


M76477 


Hs.289082 


GM2 ganglioside activator protein


1.35 


202 


411402 


BE297855 


Hs.69855 


NRAS-related gene


1.00 


46j00 


41 10/3 


AB029000 


nS.70823 


KIAA1077 protein


44 iin 
11.40 


44 9C 
11.99 


411579 


AuOOdZoB 


H$70630 


I ,A nvl A » - 1 - J n_ Rim 11 1 1 1 1 1 

IJ6 snRNA-assoaaleo STMxe proran LSnu 


1.(B 


1.90 


411617 


AA247994 


nSJ0063 


neurocalcin delta


4 Til 

1.74 


<9 CT 

2.97 


411732 




Hs.71642 


guanine nucleotide binding protein (G pr


4 m 

1.02 


4 nn 




Mil ftftCTOO 


HS.72Q26 


protease, serine, 21 (testisin)


1 9A 
ln>4 


9 40 

2.10 


411789 


At ^455113 


Hs.72157 


Adican 


9 40 


9TB 

2.79 


411800 


N39342 




mtrmhtfn ib^DCCnrfjrfpn DfEltBul iB 


23J4 


34.00 


411945 


AU33527 


HsS2137 


v-myc avian myelocytomatosis viral oncog


1.00 


aoo 


412115 


AK001763 


Hs.73239 


hypothetical protein FUKQOI 


Z07 


1.64 


412140 


AA219591 


HS.738S 


RAB6 interacting, kinesin-like (rabkines


1ia48 


9m 


412276 


BE262821 


Hs.73798 


macrophage migration inhibitory factor (


1^ 


1.49 


412464 


T7B141 


Hs.22826 


ESTs. sbnSar to 15521 4 saSvary 


1.16 


1.34 


41^ 


AA766268 


Hs.266273 


hypolh8flcdpnitoaJ13346 


41^ 


84J00 


41237 


AL031778 




nuclear transcription factor Y, alpha


17J0 


SSJOC 
412^


AW753885 


Hs.74376 


cractoneon relaed cR tocauzed prora 


14^ 


47,00 


412719 


AVW16610 


Hs.616 


ESTs 


382.46 


123.00 


412723 


AA648459 


Hsj35951 


nypoBefcapotam Ar«riZZ2 


54^ 


4 

I.OU 


412B11 


H08382 




ESTs 


1.00 


44 AA 
11.00 


412817 


AU137159 


Hs.74619 


proteasome (prosome, macropain) 26S subu


1.63 


4 il9 

1.42 


41^3 


M121573 


Hs.59757 


zinc finger protein 281


i7jm 


S6jOO 


412924 


6S01B422 


Hs.752S8 


H2A histone family, member Y


1.00 


2Z00 


413004 


T35901 


Hs.75117


oneieunn ennaroer Dnamg taoor 4 4 


Lvi 


9 AC 
Z.U5 


413011 


AW06811S 


Hs.821 


biglycan


1.22 


4 CtO 
1.(10 


413048 


M93221 • 


Hs.75182


mannose receptor, C type 1


0.30 


6l23 


413063 


AL035737 


Hs.75184 


chitinase 3-like 1 (cartilage glycoprote


3;i43 


0 T4 

ari 


413129 


AF292100 


Hs.104613


nr42noniox)g 


4.67 


4.77 


413142 


M81740 


Hs.75212 


ornithine decarboxylase 1




4 en 
2.59 


413223 


AI732182 - 


Hs.191856 


ESTs 




17 AA 

Z/JW 


41S48 


T64858 


Hs21433 


hypothetical protein DKFZp547J036


OlS 


1.06 


413273 


U75679 


Hs.75^ 


stem-loop (histone) binding protein


1^ 


4Q AA 


413278 


BE563085 


H&1633 


interferon-stimulated protein, 15 kDa


I.IU 


4 AO 


413281 


AA861271 


Hs.222024 


tfSnscnpuon 13130/ dmalz 


96l94 


CO AA 


413384 


Bt536218 


HS.13TC16 


o^sn^sel 


I.W 


4 AA 

1.U0 


413385 


M34455 


HsM) 


indoleamine-pyrrole 2,3 dioxygenase


OlS 


0 AO 

2.09 


413409 


Alo38418 


Hs.1440 


DcMjm (Asp<MUiAia*Asp/ni5) rax poiypep 


4 AA 


4 AA 


413453 


AA129640 


Ks.128065 


ESTs 


4 AA 


44 Ml 


413527 


6ES0788 


(&179B82 


nypotneocan protein ru i/44d 
secretogranin II (chromogranin C)


4 AD 


4 At^ 


413554 


AA319146 


Hs.75426 


W 4C 


441 AA 


413573 


AI733B59 


ns.149089 


ESTs 


4 AA 
1.00 


4 AA 


413582 


AW295647 


Hs.71331 


hypoMcal protein MGCSdSD 


8w60 


4A AA 
10410 


413597 


AW302885 


Hs.117183 


ESTs 


1X0 


1.00 


413690 


BE1 57489 




gD:RC1-HT0375-12020(HJl l-e06 HT0375 Homo 


1.00 


1.00 


413691 


AB023173 


Hs.75478 


ATPasQi uass vl, lype lis 


3L16 


4 44 


413719 


BE439580 


Hs.75496 


small inducible cytokine subfamily A (Cy


2J8 


9.52 


413753 


U17760 


Hs.75517 


laminin, beta 3 (nicein (125kD), kalinin


144.10 


108.00 


413801 


M62246 


H&35406 


ESTs, ^rnBar to unnamed protein 


1.00 


17.00 


413833 


Z15005 


Hs.75573 


centromere protein E (312kD)


1.00 


1.00 


413882 


AA13^73 


Hs.184492


ESTs 


6424 


148.00 


413926 


AA133338 


Hs54310 


ESTs 


1.00 


67j00 


413943 


AW294416 


Hs.144687 


Homo sapiens cDNA FLJ12961 fis, clone NT


43.42 


42.00 


413995 


BE048146 


Hs.75671 


syntaxin 1A (brain)


1.23 


1.11 


414035 


Y00630 


Hs.75716 


serine (or cysteine) proteinase inhibito


2.02 


Z51 


414142 


AW368397 


Hs34485 


Homo sapiens cDNA FU144381is,cloneHt 


1.00 


102.00 


414160 


AI663304 


H&120905 


Homo sapiens cDNA FLJ11448 fis, clone HE


632 


77iX) 


414245 


BE148072 


HS.758S0 


WAS protein family, member 1


1.00* 


1.00 


414275 


AW970254 


Hs.889 


Charcot-Leyden crystal protein


1.00 


59j00 


414317 


BE253280 


Hs.75888 


phosphogluconate dehydrogenase


1.S2 


1.73 


414334 


AA824298 


Ks21331 


hypothetical protein FLJ10036


1.78 


1.72 


414341 


080004 


H5.75%9 


KIAA0182 protein


33.90 


151.00 


414368 


W70171 


HsJ5939 


uridine monophosphate kinase


171.60 


97^ 


414416 


AW409985 


HSJ6084 


hypothetical protein MGC2721


2.32 


4 DC 

1.05 


414430 


A1346201 


Hs,7611B 


ubiquitin carboxyl-terminal esterase L1


228.15 


66J)0 


414570 


Y002B5 


Hs.76473 


insulin-like growth factor 2 receptor


1.64 


1.98 


414618 


AI204600 


HS.9597B 


hypothetical protein MGC10764 


1J7 


72.00 


414S75 


R79015 


K5.296281 


interleukin enhancer binding factor 1


4 C4 

1.51 


4 M 
1.09 


414583 


S78296 


Hs.76888 


hypothetical protein MGC12702


43.61 


D4.QU 


414696 


AF002020 


Hs.76918 


Niemann-Pick disease, type C1




71 Aft 


414711 


A)310440 


Hs.288735 


Homo sapiens cDNA FLJ13522 fis, clone PL


14.85 


42.00 


414718 


H95348 


Hs.107987 


ESTs 


1.00 


c m 
0.W 


414732 


AW41im6 


Hs.77152 


minichromosome maintenance deficient (S.




1,44 


414747 


U30672 


Hs.77204 


centromere protein F (350/400kD, mitosin


65.01 


7JI AA 


414761 


AU07/228 


Hs.77^ 


enhancer of zeste (Drosophila) homolog 2


130.35 


121.00 


414774 


X02419 


Hs.77274 


plasminogen activator, urokinase


2.24 


4 40 
2.1V 


414806 


D14694 


Hs.773a 


phosphatidylserine synthase 1


1.63 


1.53 


414809 


AI434699 


Hs.77355 


transferrin receptor (p90, CD71)


1J7 


2.60 


414612 


X72^ 


Hs.77367 


monokine induced by gamma interferon


3.48 


10.60 


414625 


X0637D 


Hs.77432 


epidermal growth factor receptor (avian


10122 


143.00 


414839 


X63692 


Hs.77462 


DNA (cytosine-5-)-methyltransferase 1


IJO 


1.69 


414883 


AA926960 




CDC28 protein kinase 1


14.29 


10.06 


414907 


X90725 


Hs.77597 


polo (Drosophila)-like kinase


1.95 


2.20 


414914 


U49844 


Hs.77613 


ataxia telangiectasia and Rad3 related


3.00 


2.90 


414945 


BE076358 


Hs.77667 


lymphocyte antigen 6 complex, locus E


1.02 


1.21 


414972 


BE263782 


Hs.77695 


KIAA0008 gene product


1.00 


1.00 


415014 




Ha^4951 


ESTs 


1.42 


2.84 


415091 


AL044872 


Hs.77910 


3-hydroxy-3-methylglutaryl-Coenzyme A sy


1.00 


30X0 


415138 


CI 8356 


Hs.295944


tissue factor pathway inhibitor 2


34.72 


107.00 


415227 


AWB21113 


Hs.72402 


ESTs 


1.87 


49i)0 


415238 


R37W 


Hs.21422 


ESTs 


1.00 


1.00 


415263 


AA948033 


HS.1308S3 


ESTs 


IJXI 


i.ro 


415295 


R41450 


Hs.5546 


ESTs 


1.00 


1.00 


415339 


NMJD15156 


Hs.76398 


KIAA0071 protein


51.18 


166.00 


415669 




Hs.76589 


serine (or cysteine) proteinase inhibito


M Oil 




415574 


BE394784 


Hs.78596 


proteasome (prosome, macropain) subunit,


1.48 


1.39 


415709 


AA649850 


H5^8558 


ESTs 


1.00 


1.00 


415735 


AA704162 


H8.120B11 


ESTs, Weakly similar to I38022 hypotheti


1.00 


72.00 


415799 


AAe3718 


H822K41 


DKFZP434D193 protein 


6.23 


31X0 


415817 


U88967 


HS.78B67 


protein tyrosine phosphatase, receptor-t 


2430 


1.00 


415857 


AA88611S 


HS.127797 


Homo sapiens cDNA FLJ11381 fis, clone HE


32.51 


35.00 


41S989 


A1267700 




ESTs 


78.69 


1.00 
416016


AVrl 38239 


HS.7M77 


proprotein convertase subtilisin/kexin t


4 AA 
1.0D 


1.00 




DcZo/agI 


Hs. 78998 


proliferating cell nuclear antigen


9.99 


2.92 


416111 


AA099al9 


ns.7Su1o 


chromatin assembly factor 1, subunit A (




4 AA 

3.00 


41B177 




ii_ 4(ncfl^ 

Hs.1o7d07 


ESTs 


4 AA 
1.00 


9.00 


415178 


AIouBSZ/ 


H5.192o2Z 


serologically defined breast cancer anti


9.09 


Q7& 


418200 


AWZSIiH 


nSb4i^99 


coTSk WBaBy sonsar to muc2_ human mucin 


9.0/ 


4 An 
1.W 




AA236776 


r15.79Uro 


MADZ (finnic siTesi oeiSKni, y 


Ota 

a. 70 


4 nn 




At n^tUi^ 

/MJUOOHSU 


lie /ROM 




OO.Df 


mm 


416250 


/vwOIOSO 


ns>/ 0Hw£ 




■•90 


919 






Ue 70^7 


pyrroline-5-carboxylate reductase 1


C.UO 


1 n 






• nSiX009£l 


ESTs 


1 nn 


fionn 

094JU 




LI&elQ 




IfwtlSe ml-iiitnrlrtn kSejARe eekiMiK 4 

IkOI^ flflr PCIOSwffMjirW inQt okSoSbBi 9 


4 OS 

1.20 


4 CA 
1.94 


4 10490 


U99092 


nS./9951 


potassium channel, subfamily K, member 1


AIJS 


67J0O 


41D00O 


UU92/2 




fibrillin 2 (congenital contractural ara


99jfil 


51.00 


410001 


AAD94549 


Ll» tQAAfX 


iw-m inKNAH3inav\g protein 9 


3.90 


C AA 


415722 


AA354504 


H5.12S40 


hypothetical protein FLJ23017


3l68 


?^ 


416810 


U77735 


H&6U205 


[Ki!v2 oncogene 


1i9 


1.84 


416936 


N21352 


Hs.42987 


ESTs, VVieaidx drniar D S21 348 probaUe 
neurotensin 


1.00 


1.00 


417034 


NMJD0o1S3 


Hsj80982 


1.00 


1.00 


41#0D1 


Al079944 


Lin 40D<iA4 

KS.1BB091 


Homo sa|Mens ci^^ Rji2a39 ns, done He 


Vi AC 

92.35 


156.(X) 


417079 


U65590 


Hs«81134 


interleukin 1 receptor antagonist


9 04 

9.91 


4.93 


4if2io 


/vV12s547 


nS209794 


pfo&Kincosena {hepatoQfte growffi fa 


4 AA 


CI AA 

91 .UU 


il470<M 

41 7299 


lAMCAne 

wzaoos 


HS.24^ 


■iiiiiil tntliialtilw ntnlihiii j-iihffimlhj Q 

smaB oioiraiBcyioiane suQiaiiuly o (uy 


9.90 


4 AC 
2.Q9 


AiTlM 
4f/«Hl0 


nDll72v 


U« P1QQ4 

nSiOlPsZ 


fSiAAOitn gene proouci 


BO OA 
02.84 


291.90 


417315 


AtftDnruo 


nS.1B049U 


noosorna pioiun o24 


4AC £4 
1110.01 


444 AA 
I21.U0 








CO IS 


1 7n 


1 9n 

l.cD 


4irO00 


ocl 09209 


riS.iU/0 


small proline-rich protein 1B (cornifin)


R 07 


9.2/ 


4I/J09 


□c20w04 


nS.e2U49 


niaxins (neurue Qromn-piunjODng tssssx 


Z.99 


1 n9 


417428 


N87579 


u— «nftaT4 


gial2)^ Human ratal neail, LanuKla ZAP 


1.00 


52.00 


417433 


8E2702^ 


Ll#> CM4 0Q 

nS.o2l2o 


914 oncoietai ttopacAtasi giycopro^ 


9U4.79 


474 AA 

179.00 


417466 


Alffl1547 


Hs^9457 


hypothetical protein FLJ22127


1.24 


1.34 


417512 


Ara79168 


. Hs.344095 


glycoprotein (transmembrane) nmb


0 4JI 

2.14 


5.50 


417515 


L24203 


Hs£2237 


ataxia-telangiectasia group D-associated


Z66 


1.68 


417542 


J04129 


Hs .82269 




4 ta 
IM 


4 

1.35 


417970 


A AOMJIJia 

AAj99449 


Ue OOODC 

H&oZZoo 


phosphoribosylglycinamide formyltransfer


A** 7e 
4Z.70 


51.00 


417715 


AW90S907 


rtSxoooD 


cSTS 


0.95 


9 7C 

2.75 


417720 


AA2055S 


U— «>AO*^gT 

nsJ0w)o7 


ESTs 


11X31 


56.(X) 


417791 


AW955339


Hs.111471


ESTs 


39.98 


16.00 


41rOi)U 


AWdu4/K> 


Hs.122579 


huiuiHmflr jl «r»tiiijn CI l4AilC4 

nypouieucai proietn rLJiU4oi 


9 £4 
2.01 


44 AA 


41700D 




Hsi2772 


collagen, type XI, alpha 1




9 AA 

2.44 


4179QD 


f>e4SA447 

DcZ9Ui27 


Ue (Mane 


CDC20 (cell division cycle 20, S. cerevi


1.52 


4 44 

i.n 


417939 


a029UB 


n&o2902 


thymidylate synthetase


4.74 




417944 


AUli771So 


r1S4}2909 


collagen, type V, alpha 2


3.61 


5.21 


417975 


AAmIojo 


HsJOuBS 


nypoinKica protein FLJ231od 


1Z49 


38i)0 


417991 


AA79i45Z 


HS.130uOo 


ESTs 


1.00 


26.00 


418004 


U37519 


Hs.87539 


_1J _ fc -A _ i_ ■ ^ ^ tt-- ■ 

8iaenyD6 oenyorogenase 9 ranuiy, (nenser 


3JJ2 


212 


418007 


M13509 


HS.o9l^ 


matrix metalloproteinase 1 (interstitial


187.S9 


4 AA 
1.00 


418054 


NM_002318


Hs.83354 


lysyl oxidase-like 2


2.85 


263 


418057 


NM_012151 


Hs.d3353 


coagulation factor VIII-associated (intr


1.54 


1.69 


418113 


AJ272141 


Q04e4 


SRY (sex determining region Y)-box 4


6l82 


5L22 


418140 


0cd1^3o . 


HS.B3S1 


microfibrillar-associated protein 2


1.26 


1.46 


418203 


X54942 


HS.B3758 


/>t\#*'>Q - ■ > » »J«,^i« « 

CDC28 protein kinase 2


134.19 


144.00 


4182)7 


Ul4O09 


Ue 4/774 

nS.94772 




4 AA 


4 AA 


418216 


AAdo2240 


nS.2oo099 


API 5q1 4 proton 


64.66 


61.x 


410490 


AVAJOQ>IAAC 
AWsWwd 


Ue W7C^A 
rISJ9r904 


ESTs 


4D CQ 
IO.59 


4jI7 AA 
14/.UU 


418249 


H89226 


nS.34o92 


raAA1323 proiem 


30.53 


106.(X) 


4182^1 


1 Inn reft 


Hs.1154 


- ..-t — i^t II > nl i1- 4 4<WlLin ' -- o 

o\nQUciai gtyooprotetn 1, i20icu\niuQn9 


1.00 


3.(K) 


41K83 


S73895 


Ue OOOjIO 

Hs. 03942 


cathepsin K (pycnodysostosis)


9.S0 


5.16 


418300 


A)433074 


nS.Boo62 


nonx> sapiens cONA FU2157o os, done C 


3L18 


291 


410922 


AA2041D0 


Ue Dil44 9 

HS.04119 


cyclin-dependent kinase inhibitor 3 (CDK


44 oc 

11.90 


0.00 


410927 


U70370 


U« <M44e 

H&04190 


paired-like homeodomain transcription fa


9.23 


2.22 


416345 


A IfMICBC 

AJ0Dio9b 


nSJZ4i407 


serine (or cysteine) proteinase inhibito


1.00 


1.00 


418379 


A A'MMJA 

AAZio940 


nS.i3751o 


fidgetin-like 1


21.68 


44.00 


418397 


Mil nA4ien 


Hs.fl4746 


chromosome condensation 1


1XX) 


6.00 


416403 


Uoo97o 


Hs.84790 


K1AAQZ25 proien 


16.91 


18l98 


418462 


PqOOISSo 


Ue OCOCfi 


integrin, beta 4


1.56 


1.16 


4ie47o 


U3B949 


Hs.11 74 


cyouHiepenaern nnase uinDHDr ^(ine 


3l22 


238 


418506 


AA0o424o 


Ks.65339 


G protein-coupled receptor 39


2L66 


'222 


418526 


nf^4 r>AOA 


HS.B5838 


solute carrier family 16 (monocarboxylic


204 


221 


418536 


Dc244dZ9 


Ue oeac4 
HS.09951 


exportin, tRNA (nuclear export receptor


133 


37.00 


416543 


NMJXIS929 


Hs.85962 


hyaluronan synthase 3


1.04 


1.23 


A4HSJA 

410974 


r«20r94 




M-phase phosphoprotein 9


48L60 


85,00 


410992 


vomv 

JCftfZZb 
437916


BE566249 


HSi?0999 


hypothetical protein FLJ23142


2115 


89J}0 


437937 


A@17222 


tk 121655 


ESTs 


1.00 


1.00 


437942 


AI888256 


Hs^752& 


ESTs 


12.28 


31.00 


438091 






nttftlpar rpfiftnlar ftuUsinilv 1 oraun L m 


1.53 


10.85 


438113 


AI467908 


H5.8682 


ESTs 


i!eo 


239 


438119 


AW963217 


Hs^3961 


ESTs, Moderaleiy sbnilv to AP1 16721 89 


2.67 


36.90 


438274 


A191B908 


Hsi5080 


ESTs


1.00 


1.00 


438378 


AVV97052d 


H&86434 


hypothetical protein FLJ21816


38.92 


3100 


438403 


AAB06807 


Hsi92206 


ESTs 


1.00 


IDO 


438494 


AA9(^78 




ESTs 


2.05 


60.00 


438546 


AW297204 


HS.125B11 


ESTs 


1.00 


131.00 


438552 


AJ24S820 


1^6314 


^p6 1 transinGRibiBno fscsptof (soizur^f 


1.43 


1.45 




AI879084 




ESTs 


1.00 


34.00 


438724 




HS.11467D 


HuffWft DNA MQUCflOP frpffi doftft RP1 1-16121 


1.33 


1.10 


438746 


AI685815 


Hs.184727 


^^^1^^^^ nnslano^f^"ciS60ctotetl anfi^an p97 ^ni 


Z42 


li9 


438779 




H&;6414 


nuc)i>fllar nfntidn A 


1:00 


18.00 


4^^21 






ESTs 


2D3 


2.57 


/lococ 

•♦O00O9 






ESTs 


6.42 


86.00 


438898 


AA401369 


>&1 90721 


ESTs 


22.41 


17.00 


43^15 


AA280174 


Hs^56B1 




1.00 


1.00 


438956 


W00847 


Hs.1 39)56 


Humm nNAsfiOiiAficft finmn f^m RPS-850Efi 


2J0 


1.88 


439000 


AW97Q121 




ab'EST391231 MAGE leseouences. MAGP Homo 


Z78 


4.81 


439023 


AA745978 


Ks^273 


ESTs 


1.17 


1.31 


439024 


RS6896 




ESTs 


lio 


28.00 


439128 


AI949371 




ESTs 


1.00 


67.00 


Ha3 1^ 


AW13R909 


H&ISfillO 


ifnnnifMfllnhiirm kstma Bfinctani 


1.38 


1.41 


439223 


AW23S299 




ULIB UnrCm nrat^ 2 


1.93 


1.64 


439285 


AL1M916 




hypothetical protein FLJ20033


46.23 


139.00 


439316 


AW837046 


Hs.6527 


G protein-coupled receptor 56


200 


220 


439343 




Hs.114611


hvnnihaflcal motMii PI JIlBOft 


6.10 


7.37 


4^394 


AA401369 


Hs.190721 


ESTs 


3^9 


17il0 


439410 


AA632012 




ESTs 


1.83 


107 


439451 


AF086270 


Ks.278554 




23.28 


5200 


439452 


AA918317 


Ks.67907 


B-cell CLL/lymphoma 11B (zinc finger pro


laTB 


12200 


439453 


BE264974 


Hs.6568 


thyroid hormone receptor interactor 13


278 


1.58 


439477 


W^13 


H&.58042 


ESTs.iytadsfatslv8bnlartoGFH3 HUMAN G 


li2 


1.44 


439492 


AFDRfiSIO 


1^109159 


ESTs 


7.46 


39.00 


439523 


W72348 




ESTs 


1.00 


1.19 


439592 


/^^086413 


Hs.58399 


ESTs 


1.00 


1.00 


439605 


W79123 


H5.58561 




33.61 


1.00 


439670 




KSi59507 


EST&We^^nflvtoAC004BS3U1 sm 


1.00 


1.00 








CO la 


4.30 


10.00 


439705 


AW87K27 


K&59761 


ESTs lAtealdv&imBartoDAPI HUMAN DEATH 


86J»5 


11,00 








Mtna rinmflln, hnmiinnainhuRn dninsin fla^ 
ooiiia UMiidiit| u( u 1 nil wj^juuuui 1 uuiiioiii ^ly/i 


2.36 


1.88 


439750 


AL359053 


Hs.57664 


Homo sapiens mRNA full length insert cDN


2.02 


6.08 


439759 


AL359055 


Hs.67709 


Homo sapiens mRNA full length insert cDN


1.00 


21JJ0 


439780 


AL109688




oh'Hfwnn eanlpns mRNA hjt] Isnoth in<;ert 


7.27 


2100 


439840 


AW449211 


Hs.105445


GDNF family receptor alpha 1


1.00 


1X0 


439926 


AWD14875 


Hs.137007 


ESTs 


3258 


71:00 


439983 




KkRTtn 




21.26 


9.55 


439979 


AWB00291 


^5823 


hypothetical protein FLJ10430


68.83 


61.00 


440008 


AKD00517 


H&6844 


hypothetical protein FLJ20510


li3 


4^12 


440028 


AW473675 


Hsi 25843 


BSTs, V\fe^ siRflar to T172Z7 hypoOistl 


1.42 


254 


440108 


AAffi4968 


nS> l&f Q39 


MAAIBOSmdrin 

IWV\ IDW |lllwBlll 


IXX) 


54.00 


440138 


AB033023 


H&J1fi127 


hvnofhaSeal anririn Fl .HfWftl 

IIJ|JwUIT>IImiW I^WWhI I MP i 


24.18 


5200 


440273 


AIB05392 


H&J25335 


Mf»nft gflntwtit rilMA* B J?3S?3 fe. dtHW L 


3il 


4.72 


440289 


AW45Q&91 


ns>i9<Uf 1 


ESTs 


38.63 


11100 


440325 




N^7164 


a disintegrin and metalloproteinase doma


6288 


147.00 


440492 


R39127 


Hs^1433 


hypothetical protein DKFZp547J036


235 


162 


440527 


AV557117 


}te.184164 


ESTs. Moderately sintoto S65657 ^Ipha 


10M 


57X0 


440659 


AF134160 


Hs.7327 


claudin 1


116 


237 


440704 


ryt6924i 


Hs.162 


insulin-like growth factor binding prote


289 


209 


440943 


AW0822S8 


Hs.14ai61 


hypothetical protein MGC2408


202 


1.41 


440994 


AI160011 


Hs.272068 


ESTs 


1.29 


1.14 


441020 


AA401369 


H5.190721 


ESTs 


14299 


17X0 


441031 


AI110684 


Ks.7645 


fibrinogen, B beta polypeptide


1.41 


99X0 









44112B 


AA57Q256 




ESTs, Weaidy siniTar tDT23273 hypoM 


4.13 


aso 


441290 


W27501 


Hs.69605 


cholinergic receptor, nicotinic, alpha p


1.00 


100 


441362 


6^14410 


Hs.23044 


RAD51 (S. cerevisiae) homolog (E coli Re


13QL23 


43X0 


441377 


BE216239 


H5.202658 


ESTs 


22j03 


1.00 


441390 


AIG9S60 


Hs.131175 


ESTs 


a65 


7.70 


441497 


R51064 


fte^172 


ESTs 


1.00 


1.00 - 


441525 


AW241867 


Hs.127728 


ESTs 


1^ 


1.« 


441553 


AA281219 


Hs.121295


ESTs 


139 


1.57 


441607 


NM.00^10 


Hs.7912 


neuronal cell adhesion molecule


1.47 


211 


441833 


AW^6544 


Hs.112242


normal mucosa of esophagus specific 1


216.22 


36100 


441636 


AAQ81846 


Hs.7921 


Hm s^ens mRNA; cDNA DKFZp568El 83 (fr 


2.31 


2.0S 


441737 


X79449 


Hs.7957 


sdono^s (teaninssfl^ RHAr^KcSc 


1.30 


1.49 


4417S0 


AA401369 


H5.190721 


ESTs 


44.15 


^7JD0 


441801 


AW242799 


Hs36366 


ESTs 


1.00 


1.00 


441919 


AIS3802 


Hs.1 281 21 


ESTs 


1.00 


122.00 


441937 


R41782 


Hi22279 


ESTs 


0.86 


157 


441954 


AI744935 


Ks.6047 


Fanconi anemia, complementation group G


1.48 


1.39 


442025 


AM87434 


Hs.1 1810 


(XA11 prcrtgin 


1.00 


46.00 


442029 


AW35SS9B 


Hs.14456 


neural precursor cell expressed, develop


9.92 


45D0 


442072 


A1740832 


Hs.12311 


Hwno safuens done 23iS70 niRNA sequence 


25J)5 


77X0 


442108 


AW452649 


Hs.166314 


ESTs 


aei 


114 


442117 


AW664964 


H5.t28899 


ESTs 


100 


' 5.49 


442137 


AA977235 


Hs.12a83D 


ESTs. Wea^ siiralar to Z19^HUMIAN ZINC 


1.00 


1.00 


442159 


AW163390 


Hs.278554 


heterochromatin-like protein 1


1.92 


1.66 


442179 


AA983842 


HS.3335S 


chromosome 2 open reading frame 2


27.22 


50.00 


442328 


AI952430 


Hs.150614 


ESTs, Weakly similar to ALU4_HUMAN ALU S


5,00 


3.42 


442432 


B£093589 


H&38176 


hypQ&elical protein FU2346B 


181.59 


76.00 


442530 


AIS80830 


Hs.176508 


Homo sapiens cDNA FLJ14712 fis, clone NT


10.59 


144.00 


442547 


AA306997 


H&217484 


ESTs, Weakly similar to ALU1_HUMAN ALU S


109.23 


9100 


442556 


AL137761 


H&8379 


Homo sapiens mRNA; cDNA DKFZp566L2424 (f


1.00 


53.00 


442619 


AA447492 


Hs.20183 


EST6» Weakly sinfla to AF164793 1 pnte 


29.02 


50.00


442710 


AU)15631 


Hs!23210 


ESTs 


1.00 


19J0O 


442717 


R88362 


H8.180591 


ESTb, Wedtfy sbnSar to T23976 hypoBieti 


1.00 


&Q0 


442875 


BE623003 


Hs!23625 


Homo sapiens clone TOCCTA00142 mRNA sequ


2255 


50.00 


442914 


AW1BB551 


HS.S9519 


t^poihe&al pKilsSiR FIJ14Q07 


25.33 


82.00 


442932 


AA457211 


^.8856 


bromodomain adjacent to zinc finger doma


3.18 


4.41 


442942 


AW16TO87 


Hs 131562 


ESTs 


a45 


64.00 


443068 


A1188710 




ESTs 


1.00 


27.00 


443204 


AW20587B 


Hs3643 


Homo sapiens cDNA FLJ13103 fis, clone NT


1XK) 


24iM) 


443211 


AI128388 


Hs.143655 


ESTs 


12.42 


2.00 


443247 


BE614387 


HS.333S93 


c-Myc target JPO1


128J4 


9100 


443324 


R44013 


^1642^ 


ESTs 


0.02 


4L59 


443383 


A1792453 


Hs.166507


ESTs 


1.00 


47X0 


443400 


R2B424 


Hs.2S064e 


ESTs 


18.52 


61.00 


443426 


AF098158 


Hs.9329 


chromosome 20 open reading frame 1


4.02 


1.75 


443572 


AA025610 


Hs.9605 


cleavage and polyadenylation specific fa


Z93 


2.57 


443575 


A1078022 


Hs.269636 


ESTs, Weakly similar to ALU1_HUMAN ALU S


1.0D 


29.00 


443614 


AV65S386 


Hs.7645 


fibrinogen, B beta polypeptide


1.00 


16.00 


443633 


AL031290 


Hs.9654 


similar to pregnancy-associated plasma p


1.00 


39.00 


443646 


AI085377 


Hs.143610 


ESTs 


39.81 


70.00 


443715 


AI583t87 


Ks.9700 




4a74 


7.00 


443723 


A]144442 


H&157144 




1.29 


1.30 


443802 


AW504924 


Hs.9805 


KIAA1291 protein
follistatin


1.^ 


1.61 


443859 


NM_013409


Hs.9914 


1.35 


1.13 


443892 


AA401369 


H8.190721 


ESTs 


1.00 


17X0 


443947 


VV24187 




gb2b47l09/1 8oaresLlelaLJungLNbHL19W 


1.33 


1.64 


443991 


NM_002250


H8.10082 


potassium intermediate/small conductance


5.71 


6.87 


444006 


BE395085 


Hs.100e6 


type I transmembrane protein Fn14


1.47 


1.92 


444009 


A1380792 


Hs!l3S104 


ESTs 


14)0 


77.00 


444017 


U04840


HsJ14 


neuro-oncological ventral antigen 1


1.00 


1.00 


444127 


N63620 


Hs. 13281 


ESTs 


1.00 


29X0 


444129 


AW294292 


Hs.256212 


ESTs 


1.00 


1.00 


444279 


U52432 


Hs.89605 


cholinergic receptor, nicotinic, alpha p


aeo 


7X0 


444371 


BE540274 


Hs!239 


forkhead box M1


2.91 


1.14 


444378 


R41339 


Hs.12569 


ESTs 


14X) 


1.00 


444381 


BE387335 


Hs.283713 


ESTs, Weakly similar to S64054 hypotheti


469.00 


55100 


444461 


R53734 


HS.K978 


ESTs. VMdy siRi&ar to 2t(326QA B cell 


12.88 


105.00 


444471 


AB020664 


Hs.11217


KIAA0877 protein


24.91 


90.00 


444489 


AilSIOlO 


Hs.157774


ESTs 


1.00 


111.00 


444619 


6E538082 


HS.B172 


ESTs, Uoderately sinSar to A46D10 


1.00 


70X0 


444665 


BE613126 


Hs.47783 


B aggressive lymphoma gene


30£6 


139.00 


444707 


A1168513 


Hs.41^ 


desmocollin 3


1.00 


1.00 


444735 


BE019923 


Hs.243122 


hypoiheSca! pfi4^ FU 13057 simBar to 


njoz 


90X0 


444781 


NM_014400


Hs.11950


GPI-anchored metastasis-associated prote


1.57 


1.31 


444783 


AK001468 


Hs.62180 


anillin (Drosophila Scraps homolog), act


77.55 


200 


445236 


AK001676 


Hs.12457 


hypothetical protein FLJ10814




27.00 


44S25B 


A1635S31 


H5.147613 


ESTs 




73X0 


445413 


AA1S1342 


Hs.12677 


061-147 protein 


28.14 


50X0 


445417 


AK00105B 


H5.12680 


Homo sapiens cDNA FLJ10196 fis, clone HE


1.81 


2X2 


445443 


AV653838 


Hs.322971 


ESTs 




1X0 


445462 


AA378776 


HS288849 


hypothetical protein MGC3077


Z(B 


1.70 


445517 


AF208855 


H5.12B30 


t ji- ^ J* — ■ ^» - »- 

nypouieuca poKsn 


\JSf 


7OX0 


445537 


AJ245671 


Hs.12844 


EGF-like-domain, multiple 6


1.71 


Z72 


445^ 


AF167572 


Hs.12912 


skb1 (S. pombe) homolog


1.S2 


1.34 


445654 


X91247 


H5.13046 


thioredoxin reductase 1


1J1 


1.S2 
445^


>U570830 


HS.17487D 


ESTs 








ESTs 


It 30(0 




Hs^1d46 




44^5 








445898 


nrvi vwfcv 


Hs.13423 


Homo sapiens clone 24468 mRNA sequence














Hs.333555 


Homo s^pisiB dono 24^9 iiiRNA S6tpi6ftC6 








p6SC3^Io {z^brsSsh} fuuiulOQ If coJtliii 


446078 


Af339382 


Hs.156061 


ESTs 


44S102 


f\lT lUUWVI 


Hs317694 


ESTs 


446157 


RP97Dfl9B 


HS.13174D 


Uomfi fifl&ifins dVIA: PL122fifi2 fis. danfi H 






(fa.14559 


iljyMH tCMMM inWWil rMl IWw IV 


446292 


>^TO1497 


Hsl279682 




446293 


At420213 


Hs.149722 


ESTs 


446423 


AW13dfi5S 


Hs 150120 


ESTs 


448428 


AWD8227D 


Hs.12496 


ESTs, Weakly similar to ALU4_HUMAN ALU S


446432 




H« 150058 


ESTs 






Hs.15243 










ESTs 


446619 


AU076643 


Hs^13 








Hs 15767 


cSron (rtu)^nl6r8Gfln0i ssfinsfDtfBOi^i 




AW1 193^3 


Hs 141887 


ESTs 


446839 




Lfe icOAd 


nritoGc spntdb ooB6d*ctf fslstBd pot 




Alin7ltf%17 




ctesvsQfi 8ful pofyBdonybSon spBcRic 6 






ns. 1 W*r 1 # «/ 


ESTs 










■rnXiOU 


Alft11RQ7 
nlQl lOU/ 


Hk.1 08646 


MnnnfianiBnscDNA FlJt4934 fis. donfi PL 






HS.1653D 


snisA u^iKiiLb cykddiw suUsndy A (Cy 






Hs.16740 


hanaShpScsS nrttfsm FIJ11Q36 


447022 


AW291223 


Hs.1 57573 


ESTs 






rut, 19 f QUI 


ESTs 


447078 




Hs.9914 


ESTs 


447081 


Y13696 


HS.172B7 


Bftfaguaiim {rwKrrihi-^&rtHtftna channel s 


■Htf l<31 




}-k 17ififi 




447149 




Hs.326 


TAR fHM RMA.^ina om^ 2 






r*a.O 


ESTs 






Me 17(t1fl 
n5>ii3IO 




447178 




) 9<*r 1 i 


ESTs 


447250 




Hs.1 7883 


ntrdptn nhfYiinhsibteA 1 R fffumprtv Rid 






Lie 9fiQ7R 


URHdiwITla (SlU^aili ifiDiiny Ml 0 


447342 




Hs.1 9322 


Hnmfi enntpne. ^^mlbv In RtKEN cOMA 201fl 








P<^e HiAhhr eimSar In (>n9^Q9 ;i)nhA.9.in 


447350 


AJ375572 


Hs.172fi34 


ESTs 


M7T77 




HS-334334 


franewtnfinn Car4rv AP-9 afnhfl facfivst 

UallSHitlpUMii lOUm /V^~& OlplKI ^OmW W 






Hs.28149 




447425 


AI963747 


Hs 16573 




447519 


U46258 




ESTs 


447532 




H« 1R7Q1 


hvnnfhRfiral Bmioln CI -I9flfifl7 


447534 


AA401369 




ESTs 




VIOfUS 

T lUUw 




Ifa^lHnobO^ youp (fionhislOTfi cftrofposo 


447888 


N87079 


Hs 19236 


TsgatCAT 


447733 


AT ivf WJi. 


H5.19400 


MAD? /ndtelfemas) itrfdanL vaasL h 


447769 












lie IRIAtK 


ESTs 






M« 1Qft79 
ns. iwiM 


CPP9^ /Q MCMiielap) rpljrffwf npfM famH 

OCV'f^ \0< wwWiSMClOJ IBIOUM \|oro 101 IW 








PfiTe Waaktu similar In T9'^1 10 hvnOihftfi 








StfUilai HI O. UflCraiaa OOfm 








11 (taiuM ano*9|MniKJlg sniwiiioUKii DUUtaiiutj n 






tie 0QR9A1 




446243 


AW^G<J77t 


He C9fi9n 


integrin, beta 6


448278 




Hs.11^2 


ESTs 






He 9nAA? 








ns> Iv949 


IWtm luv^e rDNA PI J14162 fl& danfi NT 






tie insQO^ 


tVMMOi iTicjiiuoi Tv%o unmyBiic idiiinj 


448390 


AL035414 


n9.< luoo 


hypotho&si protein . 






He OiTTK 


hvnnHmfleal Amlelfl FIJ1 101 1 








eilntial hamftiKAr nnil aeAvsAor of trans 




DCvl*KlS9 




bypoUidlcflB proteb) h^5Cl4797 






Ue 9964 nfi 

nS.22aiU0 


C0i« 


446733 




He IftTCKR 
nS.IOf 930 


enftitn rsmnfft famlfv fi fftBUmfraRSnilSa 


448741 




He 1(yt7A 




iiR?^ 




Ue2JW9n 


TATA hrw WiwPfM nmftf^ fTBPVafifiaeialft 


448775 


AB025237 




fiiiffiw fmiriMMtiftn fllfthosDhalft Bnlml mfll 

lltltlH ilHIUvUOHW uipiiuapfww WHWii iimi 








COl9| TlBMnlj BlIIBIBI IV |HMaUvV pi mW |ri 


'rWWJw 




He 991ft1 




448844 




He1771fid 

ns>i«' 10** 


ESTs 


448988 


Y09763 




gamma-aminobutyric acid (GABA) A recepto


448993 


AM7t630 




KIAA0144 gene product


449003 


X76342 


Hs^ 


alcohol dehydrogenase 7 (class IV), mu o


449029 


N28989 


HS.22B91 


solute carrier family 7 (cationic amino


449040 


AR)40704 


HS.14944S 


putative tumor suppressor


449048 


245051 


Hs.22920 
TABLE 9C

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey


Ref


Strand


Nt_position


400512 


9798593 


Minus 


1439-1615


400517 


9796688 


Minus 


4999&-50346 


400560 


9843598 


Plus 


94182-94323,97058-97243,101095-101238,102824-103006


400664 


6118496 


Plus 


13558>13721.1394M4090,14S54*14679 


400665 


8118496 


Plus 


16879-17023 


400666 


8118496 


Plus


17982-18115,20297-20456 


400749 


7331445 


Minus 


9162-9293 


400763 


8131616 


Minus 


35537-35784 


401027 


7230983 


Minus 


70407-70554,71060-71160 


401093 


8515137 


Minus


22335-23166 


401203 


9743387 


Plus


17^81-173056^173868-173928 


401212 


6858408 


Plus 


67839-68028


401411 


7799787 


Minus


144144-144329 


401435 


8217934 


Minus 


54508-55233 


401464 


6682291 


Minus 


170688-170834 


401714 


6715702 


Plus


96484-96661 


401747 


9789672 


Minus 


118596-118816,119119.119244.119609-119761J20422-120990.130161-130381,130468.130533.131^^^ 








131932.132451-132575.133580-134011 


401760 


9929699 


Plus


8312&«250.6532&«5540.94719^5287 


401780 


7249190 


Minus


28397-28617^20.29045,ai35-29296^9411-29^,29705-29787,30224-30573 


401781 


7249190 


Minus


63215-83435,83531-6365633740^01 34237<«4393.84^545037.e6^0^14 


401785 


7249190 


Ites 


165776-1^98,166189'166314.166408.166563.167112.167268,167387.16746ai68634-168942 


401797 


6730720 


Plus


6973-7118 


401961 


4581193 


Minus 


124054-124209 


401985 


2580474 


Plus 


61542-61750 


401994 


4153858 


Minus


42904-43124.43211-43336.4460744763.4519W5281.46337-46732 


402075 


8117407 


Plus


121907-12a)35i122604-122921,124019-124161.124455-1246mi25672-12&076 


402260 


3399^ 


Minus 


113765-113910,115853-115765^116808-116940 


402265 


3287673 


Plus 


21059-21168 


402297 


6598824 


Plus


927945405^3557345659 


402408 


9796239 


Minus


110326-110491 
402420


979S339 


Plus


129750-129919 


402574


6077108 




33290>3KQ2 


402802 


32B7156 


Minus


53242-53432 


402994 


S96643 




4727-49© 


403137 


9211494 


Minus


92349-92S72.92958-93084,93579-9371 2^9>94072.94591'94748.95214-95337 


403305 


80^45 


Plus 


127100-127251 


403329 


8516120 


Plus


96450-96598


403381 


9438267 


Minus


26009-26178 


403478 


9958258 


Plus 


1164S-1 16564 


403485 


^66528 


Plus 


2888->3001,31984532i365S4117 


403827 


8569879 


Minus


23^24342 


403715 


7239669 


Plus


85126-65292 


404044 


9558573 


Uws 


225757-225939 


404076 


99317S2 




3848-3S67 


404101 


8076925 




12S742.1S997 


404140 


9843520 


Plus 


37751-38147 


404165 


9925489 


Minus 


69025-69128 


404185 


4572564 


Minus 


123171-129327 


404210 


5008248 


Plus 


1699B-170121 


404253 


9367202 


Minus


55675-56055


404287 


2326614 


Plus 


53134-53281 


404298 


9944263 


kta 


73591-73723 


404347 


9838195 


Plus 


74493-74829


404440 


7528051 


Plus 


8043(^1581 


404721 


9856648 




173763-174294 


404794 


4826439 


Plus


101619-101698 


404854 


7143420 


Plus 


14^14537 


404877 


1519284 


Plus 


1095-2107 


404927 


7342002 


Plus 




404996 


6007690 


Plus 


a7998^145L38fi52-38S9&^9727-39872.40557-40874.4235M2450 


406449 


7622497 


Plus 


42236-42S70 


405568 


6006906 


Plus 


35912-36065 

w^tf i£ www** 


405572 


3800891 


Plus 


85230-85938 


405546 


4914350 


Plus 


741-969


405676 


4557087 


Plus 


73195-73917 


405770 


2735037 


Plus 


61057^2075 


405932 


7767812 


Minus


123525-123713 


406137 


9165422 


Minus


30487-31058 


406360 


9256107 


Minus 


7513-7673 


406399 


9256288 


Minus 


63448-63554


408467 


9795551 


Plus


182212-182958 
TABLE 10A: Potential Therapeutic, Diagnostic and Prognostic targets for Ther...

Table 10A shows about 307 genes up-regulated in non-malignant lung disease relative to
rtom^ tung and nGn4nars9narn tung ^asa. These genes were select 

Table 10B shows the accession numbers for those Pkeys lacking UnigeneIDs. For each probeset we have listed the gene cluster number from which the

oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence

similarity using Clustering and Alignment Tools (DoubleTwist). The Genbank accession numbers for sequences comprising each cluster are listed in the

'Accession' column.

Table 10C shows the genomic positioning for those Pkeys lacking UnigeneIDs and accession numbers. For each predicted exon, we have listed the genomic

sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the
average of normal lung samples

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples



Pkey


ExAccn


UnigeneID


Unigene Title


R1 


R2 


404394 






EN$P00000241 075;TfWAP PROTEIM 


a79 


3.10 


404916 






Target Exon 


1.W 


159.00 


405257 






Target Exon 


1.00 


422.00 


407228 


M25079 


Hs.155376 


hemoglobin, beta


047 


233 


407568 


AA7409&4 


Hs£^ 


ESTs 


1.00 


123.00 


408562 


AI436323 


lteJ1141 


Homo sa^ns mRMA for lOAAISOS protein. 


too 


230.00 


409031 


AA376836 


HS.7672B 


ESTs 


1.00 


i2aoo 


410434 


AF051152 


Hs.63658 


toll-like receptor 2


39£5 


149.00 


410467 


AF102546 


KSJ63931 


dacnsnQna (unssopnim nornmog 


1.00 


loaoo 


410808 


T40326 


H$.167793 


ESTe 


1.14 


iai4 


412351 


AL135960 


KS.7382B 


T-cell acute lymphocytic leukemia 1


a37 


2.27 


412372 


R65998 


ffe285243 


tiypothetcd protein FU22iQ29 


in 


173.00 


413795 


ALJ04O17B 


Hs.142003 


ESTs 


aio 


IliO 


414154 


AW205314 


Hs323060 


ESTs 


a62 


Z09 


41^14 


D4935B 


Hs.75619 


gtycQprot^ NSA 


0.03 


4.55 


414938 


NM.002543 


Hs.77729 


oxidised low density lipoprotein (lectin


0.64 


297 


415122 


D60708 


Hs^45 


ESTs 


0.07 


8.97 


415765 


KM.005424 


Hs.78824 


tyrosine kinase with immunoglobulin and


a67 


1.65 


415775 


K00747 


Fte.29792 


ESTs, Weakly similar to I38022 hypothet


029 


264 


415910 




Ks.76913 


chemokine (C-X3-C) receptor 1


1.00 


145.00 



416319 


Ai81^1 


H5.79197 


416402 


NMJ00O715 


H3.IOI2 


417^ 


D13168 


KS.820Q2 


417421 


AL138201 


Hs^120 


417511 


AL049176 
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^90202 


KS.1550D1 


UNC13 (C. elegans)-like


085 


1X6 


425771 


OCwW 1 f f w 


Hs.1 59494 


Orutpn rpn3QlobiiiiTT5fTi?3 tyrosine fanfls 


1.18 


Z56 


42B486 


DC 1 / OtXW 


R«; 170066 


Homfi sanans mRNA: cDNA OKFZd^B022D ff 


1.00 


76.00 


427507 




H*: 17Q1S2 


tnlUikp iPMfitor 7 


1.00 


63.00 


427618 


MM 000760 


Hs.2175 


nraofttf &ftiitt_B#innfi fdfiutf 3 ffifififlior iflr 


0.60 


2.19 


427732 


NM 002980 


Hs.2199 




0.97 


1.42 


427952 


AA^53fi8 


Hs.293941 


ESTs, lytoderately similar to A539S9 ttvom 


1.W 


105.00 


4287(B 


BF2fiS717 


Hs.1 0491 6 




1.00 


80.00 


428769 


AW90717S 


Hs.1 06771 


ESTs 


O09 


2.55 


428780 






ESTs 


1.00 


98.00 


428833 


A{Q9fl355 


Hs 185805 


ESTs 


1.00 


iiaoo 


429657 




Hs.2465 


KIAA0001 oBfiA nmriud! nuhrih/a &4tfQlri 


1.00 


52.00 


430212 


AA469153 




(rfEnc67flD4.fi1 NCI CGAP^ Homo saoiens 


1.00 


132.00 


430226 




KsJ55t 


fldrsiiflfyic bfll^^i rsoeplor, surfBoe 
cluoinQfloino 1 open rflfldnfl frame 21 


Oil 


15.60 


430376 


AW9Q90(;3 


Hs.1 2532 


1.00 


103.00 


430414 




Hs 19038S 


ESTs 


oso 


6.96 








ESTs 


1.00 


70.00 


430843 


AI734149 


H&l 19514 


ESTs 


too 


90.00 


430998 


AF128iB47 


Hs.204038 




0.29 


1.84 


431217 


NM 013477 


HS.25D830 


Rho 0^P2S8 sctfvsEfnQ pfol^n 6 


too 


79.00 


431921 


N46466 


Hs5B879 


ESTs 


091 


1,67 


432176 




Hs11227B 




066 


2.63 


439903 


AA3{^46 


Hs.49 




too 


7O00 


432231 


AA339977 


Hs.274127 


CLST 11240 ora^ 


0.46 


1.46 


432485 


N908fi6 


9715770 


CDW52 antigen (CAMPATH-1 antigen)


079 


2.25 




D11466 


Hs.51 


nhncr^atirMifiQRllnl fllwcM_ dass A f oa 


1.93 


4.83 


4325% 


AJ224741 


Hs^7fi4B1 


r^airfli^ 3 


O04 


5.79 


432850 


)®7723 


K5.31t0 


angiotensin receptor 2


too 


167.00 


HoO 100 


AR09QAQR 


He <;Q79Q 


semaphorin sem2


0.04 


9.16 


433553 


Arn9G37 
rat u«w^l 


Hs 277901 


ESTs 


too 


91.00 


433586 




Lie <] 94 we 




120.16 


315-00 


434445 




H3.11782 


ESTs 


0.60 


1.84 








CO 1 9« ImFcomj oin BUS tw udiKMUiiiiauvMrf 


1.00 


128.00 






Ks.37744 


Hrwnn cafMPns hnt:u1 flrfrpnAfoic racBfldfir 


too 


108.00 


436051 


AI248584 


Hs 190745 

no. iovi*tv 


Homo sapiens cDNA: R J 21326 fis, done C 


too 


91.00 


437157 


RPOdSfiSQ 

OCW'IOOUU 


no. KUOiM 


ESTs 


IJX) 


87.00 










too 


105.00 


437311 


AA370041 


Hs.9456 




too 


71.00 


437439 


K29796 


Ma9fiQR99 


ESTs 


1X0 


115.00 


438199 


AWDIBSai 

fUVIf 1 IMv 1 


Hs.122147 


ESTs 


too 


80.00 


439551 




Hs.11112 


ESTs 


030 


3.10 




AJ131245 


Hs.7239 


SEC24 /S. COeirf^) iPlatarf oflM tana 


too 


77.00 


440887 


AI799488 


kki3S9(£ 

no. I vwOUw 


ESTs 


1X0 


85.00 


4410K 


AA9138ffl) 


Hb 176379 


ESTTs 


too 


82.00 


441384 


AA447849 


H5.2S86S0 


Homo sapwns cDWA.* FU2?1fl2 fe, done H 


079 


1.^ 


441735 


AI738675 


Hs.127346 


ESTs 


too 


75.00 


442200 


AW590572 


Hs.2357^ 


ESTs 


0.76 


5.83 


442832 


AW2065S0 


Hs253569 


ESTs 


O03 


10X6 


442957 


AI949952 


Hs.49397 


ESTs 


1X0 


70.00 


443282 


T47764 


Hs.132917 


ESTs 


1X0 


197.00 


443547 


AW271273 


Hs.23767 


hyiK^ficd piotdn nJl2666 


too 


253.00 


443951 


F13272 


Hs.111334 


tennnv flQm poiypepooe 


055 


2.09 


444330 


AI597655 


Hs.49265 


ESTs 


1X0 


9O00 



PCT/US02/12476 



444515 


AW204908 


Hs. 159979 


ESTs 


1.00 


84j00 


445769 


Af741471 


Ks.23666 


ESTs 


0LQ2 


4J8 


44S08 


R13S80 


Hs.13438 


Homo sapiens clone 24425 mRNA sequence


1X0 


97.00 


446291 


BE397753 


Hs.14623 


interferon, gamma-inducible protein 30


0l93 


1.69 


446917 


A1347883 


Ks.156672 


ESTs 


1.00 


106.00 


447261 


NM.006691 


Hs.17917 


extrapdhriar Iflik (kgnaiivcontainH^ 1 


0.40 


47^ 


447432 


AW95e473 


Hs.301957 


nudix (nucleoside diphosphate linked moi


1.00 


100.00 


447482 


AB033059 


Hs.16705 


KIAA1233 protein


<L05 


BJZl 


447997 


H0Q856 


Hs.29792 


ESTs, Weakly similar to I38022 hypothet


0.02 


5.42 


448299 


AA497D44 


Ks.2)887 


hypothetical protein FLJ10392


1.00 


79X0 


448782 


AL0S02^ 


Hs.22039 


KIAA0758 protein 


0.42 


1.00 


4S0575 


Nlii.00S659 


H83117 


purine-rich element binding protein A


0.17 


11.33 


450584 


AA04tM03 


Hs.60371 


ESTs 


1.00 


94.0U 


450593 


AW450461 


Hs.2039^ 


ESTs 


1.00 


ai.lM 


450715 


AI266484 


Hs^1570 


ESTs, Weakly similar to KIAA1324 protein


1X0 


152.00 


451103 


R52804 


H5^956 


DKFZP564O206 protein


1.00 


66.00 


451220 


AF124K1 


Hs.26054 


novel SKQ-containing protein 3 


0.60 


1.30 


451668 


Z43948 


Hs^26444 


cartilage acidic protein 1


0.54 


1.91 


452197 


AWQ23595 


Hs.232D4d 


ESTs 


1.00 


67.00 


452331 


AAS98509 


Hs.ai17 


purine-rich element binding protein A


4.53 


11.07 


452353 


CI 8825 


Hs^191 


epithelial membrane protein 2


0.72 


2.24 


453049 


BE537217 


Hs.30343 


ESTs 


1.00 


68X0 


453107 


NM_016113 


Hs^9746 


vanilloid receptor-like protein 1


0.83 


1.70 


453355 


AW29S374 


Hs.31412 


Homo sapiens cDNA FLJ11422 fis, clone HE


1.00 


132.00 








ESTs 


1.00 


72.00 


453531 


AA417940 




ESTs, WeaMy slnflar to JC5795 CDEP pnd 


i!oo 


68.00 


454741 


BE154398 




gb:CM2-HT0342-^299-05Wj05 HT0342 Homo 


0.57 


2.89 


456579 


AA287827 


Ks^20S 


up-regulated by BCG-CWS


1.00 


8ZO0 


456672 


AKD02016 


Hs.114727 


Homo sapiens, clone MGC:16327, mRNA, com


0L79 


1.96 


457400 


ARJ32906 


H&2S2549 


cathepsin Z


103 


3.25 


457718 


F18572 


H&22978 


ESTs. Weddy andar to AUJ4J4UMAN AUI S 


1.00 


113X0 


459S95 


F03Q27 




glKHSC1KA072 mrmallzBd tnfisnt brain cON 


1.00 


644.00 



TABLE 10B

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT Number Accession

408074 103884 1 R20723AA263003AA333976AA3347MAA334151 AVV965490AA310513AI81C530D31302AW134897AA8a^ 
C06094AW104534 

411667 1253334 1 BE160198AW935898T1152DAW935930AWB56073AW861034 

413533 1375344'l BE146973 BE145972 BE147042 BE147018 BE146783 BE147D20 BE146781 BE147019 BE146766 BE147021 BE146952 8E146767 BE147044 
BE146797 BE146776 BE146985 BE146793 BE146768 BE146771 BE146954 BE146760 BE147048 BE147025 BE147030 

423387 22779 1 AJ012074U11087L13288 X75a9L20295AW630780H1488DT28037AI872991 R72136AW449839 T81622T79W 

R73300 A1797007 R73390 AA961010 H741 68 AI689932 BE045543 AB08418 AI608912 A1806573 AW884084 AW872978 AWB729B5 AA565655 
AI022915 R50647 R73210 H45098 R46451 AW166269 T71132AI264547R52146AI304920 R73391 AW884059AW884085 H73241 T60038 
T79612 R73145 R50549 A1094557 AI668793 R72302 AI554356 W01956AM18982 W32571 R72840 H45409 R72085 R45356 R46758 
AA50ffiffiAA418798TB3rei R94072T16182AA928785AA903898 

423698 23112 1 Z92546AA330586A]570568AVV341487A1827D50AVV298668AI7g21@Al015693AI733599AI572251 AI672468AW193^ 
A1884375AI206100AA312444AI269365AI640254AW772486AI867336AA627604Hie914AA35B477AA33^ 

430212 314437J AA469153AI718503AA4692B 

436532 421802_1 AA721522 AW975443 T93070

453531 97026_1 AA417940 AA0367^ T07025

454741 1232559_1 BE154396 AW817959 BE154393



TABLE 10C



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA

sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
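
As an illustration of how an Nt_position entry can be read, the following is a minimal Python sketch that splits an entry into (start, end) exon intervals. The function names `parse_nt_position` and `total_exon_length` are hypothetical and are not part of the tables; the sketch assumes an entry is a comma-separated list of `start-end` ranges, as in the cleanly printed rows, and that both end points are inclusive.

```python
def parse_nt_position(field: str) -> list[tuple[int, int]]:
    """Split an entry such as '17982-18115,20297-20456' into
    a list of (start, end) nucleotide intervals."""
    exons = []
    for part in field.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma
        start_text, end_text = part.split("-")
        exons.append((int(start_text), int(end_text)))
    return exons


def total_exon_length(exons: list[tuple[int, int]]) -> int:
    """Sum the interval lengths, assuming inclusive end points."""
    return sum(end - start + 1 for start, end in exons)
```

An entry whose punctuation was damaged in printing will raise a `ValueError` here rather than being silently misread, which seems the safer default for this kind of data.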



Pkey

4tRJ754 

401045 

401083 

402474 

402808 

403021 

403421 

403438 

403687 

403764 

404277 

404288 

404394 

404518 

404916 

405106 

405257 

405381 



Ref

7331445 
8117619 
3242744 
7547175 
6456148 
7547270 
9665041 
9719879 
7387384 
7717105 
1834458 
27G9844 
31353(S 
8151988 
7341826 
8079395 
7329310 
6006920 



Strand 

Plus 

Plus 

Rib 

l^dinus 

MGnus 

Plus 



Phis 
Plus 



Minus 

Plus 

Mnus 

Pius 

Plus 

Minus 

F%iS 

Mnus 



Nt_position
144559-144664 
9004490184.91111-91345 
33192-33360 

53526-5362B,»755-5592D.5^30-57757 

114964-115136.1154&1-115565.11S931-116047.117666.117771.118004-118102 
120799>120966 

126809-12S773.139986.14Q205 



9009-9534 
118692-11B853 
91665-91946 
3512-3691 

37121-3720537491-37762.41053411444132241593,4177341919 

84494^603 

91057-91188 

8087741418 

73121-73273 

7638W4 
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406387 92S6180 Plus 116229>t16371.1175)M1765t 
TABLE 11A: Genes Distinguishing Adenocarcinoma from Other L...

Table 11A shows about 84 genes upregulated in lung adenocarcinoma. These genes were selected

from about 59880 probesets on the Eos/Affymetrix Hu03 Genechip array.

Table 11B shows the accession numbers for those Pkeys lacking UnigeneIDs. For each probeset we have listed the gene cluster number from which the

oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist). The Genbank accession numbers for sequences comprising each cluster are listed in the

'Accession' column.

Table 11C shows the genomic positioning for those Pkeys lacking UnigeneIDs and accession numbers. For each predicted exon, we have listed the genomic

sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the
average of normal lung samples

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples
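
The R1 and R2 columns are ratios of group averages. As a minimal sketch of that arithmetic, the Python below computes both ratios for a single probeset. The function name `expression_ratio` and the sample values are hypothetical illustrations and are not taken from the tables.

```python
from statistics import mean


def expression_ratio(disease_samples, normal_samples):
    """Average of the disease samples divided by the average
    of the normal lung samples."""
    normal_avg = mean(normal_samples)
    if normal_avg == 0:
        raise ValueError("average of normal samples is zero")
    return mean(disease_samples) / normal_avg


# Hypothetical expression values for one probeset.
tumor = [120.0, 80.0, 100.0]
non_malignant = [12.0, 8.0]
normal = [10.0, 10.0]

r1 = expression_ratio(tumor, normal)          # lung tumors vs. normal lung
r2 = expression_ratio(non_malignant, normal)  # non-malignant disease vs. normal lung
```

With these made-up values, a probeset with a high R1 and an R2 near 1 would be elevated in tumors but not in non-malignant lung disease.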



Pkey

403329 

408399 



407881 



409103 
409187 
409269 
410076 
410102 
410399 
411908 
412612 
414075 
416208 
417542 
419183 
41S502 
419831 
420931 
421155 
421190 
421474 
421515 
421582 
422028 
422095 
422311 
422867 
423472 
423554 
424502 
42454^ 
424905 
424980 
425523 
426230 
427701 



ExAccn



M29540 

AI827976 

AW072003 

BE296227 

AF251237 

AF154830 

AA576953 

T05387 

AW248508 



U7943 

KM_000047 

U11862 

AW291168 

J04129 



428758 
429170 
429263 
429610 
430508 
430985 
431548 
431558 
431986 
432375 
432677 
433556 
433819 
434001 
434424 
434792 
436217 
436749 
438972 
437B66 
437935 
438915 
439451 



AU075704 

AW188117 

AF044197 

H87879 

U95031 

U76362 

Y11339 

A!910275 

U80736 

AI858872 

ARI73515 

1^37 

AF041260 

M90516 

AF242388 

M88700 

t^002497 

BE245380 

ABW7948 

AA367019 

AM11101 

AB007863 

AA433988 

NM.001394 

AA019004 

AB024937 

AI015435 

AA49Q232 

At834273 

AF176012 

AA538130 

8£53ffl69 

NiyiL004482 

W56321 

AW511097 

AWg50905 

AI811202 

AA649253 

T53325 

AA584890 

AA284679 

AA156781 

AW939591 

AA280174 

AF088270 



UnigeneID



Hs.220529 

Hs^4391 

Hs.40g68 

Hs.250822 

Hs.112208 

Hs^0966 

HS.2B72 

Hs.7991 

H5279727 

Hs.72924 
Hs.74131 
Hs.75741 
H8.41295 
Hs.82289 
Hs.8g663 

H5^154 
Hs.100431 
Hs.102267 
Hs.102482 
Hs.104637 
Hs.105352 

Hs.110828 

Hs^2804 

Hs.1 14948 

Hs.1584 

Hs.129057 

Hs.1674 

Hs.149585 

Hs.150403 

Hs.153704 

Hs.153952 

Hs.158244 

Hs^413» 

H8JS43B88 

Hs.185140 

HS.985Q2 

Ks^ 

HB.1983S6 

Hsi11092 

Hs.104637 

Hs.27323 

Hs.9711 

Hs^0720 

Hs.149018 

Hs.2962 

1^8611 

Hs.1 11460 

Hs.112765 

Hs.3697 

Hi325335 

H5.132458 

Hs.107 

Hs^02 

Hs.25640 

Hsj940 

H5.285681 

H&278554 



Unigene Title
Target Exon 

NM_003122*:Homo sapiens serine protease
carcinoembryonic antigen-related cell ad
hypothetical protein FLJ13612
heparan sulfate (glucosamine) 3-O-su
serine/threonine kinase 15
XAGE-1 protein

carbamoyl-phosphate synthetase 1, mito

hypothetical protein FLJ13352

ESTs

Homo sapiens cDNA FLJ14035 fis, clone HE
synuclein, gamma (breast cancer-specific
cytidine deaminase

arylsulfatase E (chondrodysplasia puncta
amiloride binding protein 1 (amine oxida
ESTs, Weakly similar to MUCgJjUMAN mm
progestagen-associated endometrial prote
cytochrome P450, subfamily XXIV (vitamin
fibrinogen, A alpha polypeptide
popeye protein 3

small inducible cytokine B subfamily (Cy
lysyl oxidase

mucin 5, subtype B, tracheobronchial
solute carrier family 1 (glutamate trans
GalNAc alpha-2, 6-sialyltransferase I, l
trefoil factor 1 (breast cancer, estroge
trinucleotide repeat containing 9
hypothetical protein FLJ22704
cytokine receptor-like factor 1
cartilage oligomeric matrix protein (pse
breast carcinoma amplified sequence 1
glutamine-fructose-6-phosphate transamin
lengsin

dopa decarboxylase (aromatic L-amino aci
NIMA (never in mitosis gene a)-related k
5' nucleotidase (CD73)
KIAA0479 protein
protease, serine, 1 (trypsin 1)
nuclear autoantigenic sperm protein (his
KIAA0403 protein
hypothetical protein FLJ14303
dual specificity phosphatase 4
ATP-binding cassette, sub-family A (ABC1
LUNX protein; PLUNC (palate lung and nas
ESTs

ESTs, Weakly similar to I76885 serine/th

novel protein

J domain containing protein 1

Novel human gene mapping to chomosome 20

S100 calcium-binding protein P

UDP-N-acetyl-alpha-D-galactosamine:polyp

calcium/calmodulin-dependent protein kin

ESTs

serine (or cysteine) proteinase inhibito
Homo sapiens cDNA: FLJ23523 fis, clone L
ESTs

fibrinogen-like 1

lectin, galactoside-binding, soluble, 4
claudin 3

metallothionein 1E (functional)
mucin 13, epithelial transmembrane
Williams-Beuren syndrome chromosome regi
heterochromatin-like protein 1



R1 


R2 


1.00 


61.00 


1.00 


39.00 


226.37 


350.00 


0.77 


1.18 


1.00 


10.00 


7.76 


1.00 


BOM 


40.00 


1.(K) 


1,00 


1.00 


1.00 


1.12 


1.50 


9JB3 


i!oo 


0.92 


1.06 


1.00 


1.00 


1.02 


1.03 


0.84 


1.07 


3.67 


1.00 


1.28 


135 


1.00 


1.00 


13.05 


115.00 


1.00 


13.00 


1.00 


8.00 


i!oo 


15.00 


1.17 


1.55 


1.46 


1.76 


1.00 


3.00 


1.23 


1.00 


1.00 


52.00 


4.37 


2,34 


1.15 


1.78 


1.69 


3.17 


48.13 


72.00 


1.00 


50.00 


1.00 


1.00 


i!oo 


59.00 


21.35 


1.00 


1.00 


1.00 


1.00 


35.00 


1.00 


83.00 


7.41 


34.00 


1.00 


6.00 


1.08 


1.13 


16.18 


105.00 


1.07 




159 


1.69 


4.75 


7.27 


0.94 


1.28 


5i66 


15.00 


49.76 


37.00 


1.19 


1.47 


1.65 


1.06 


1.0) 


48.00 


1.00 


19.00 


171 


aoo 


29.31 


7ZO0 


1.00 


64.00 


8.52 


44.00 


57 J7 


31.00 


1.10 


1.41 


1.59 


1.46 


a&2 


101.00 


1.60 


1^ 


1.00 


1X0 


2128 


52.00 
439759


AL3S90S5 


HsJSITOB 


Homo sapiens mRNA full length insert cDN


1X0 


21.00 


441031 


Af 110684 


K5.7645 


fibrinogen, B beta polypeptide


1.41 


99j00 


441377 


BE218239 


HS.2026S6 


ESTs 


22.03 


1.00 


443S14 


AV655386 


Hs,7845 


fibrinogen, B beta polypeptide


1.00 


16.00 


443813 


AA876372 


H5^3^1 


HofTO sa]^ cDNA OKFZp667D095 (fr 


1.20 


1.99 


443991 


NM_002^ 


Hs.10082 


potassium intermediate/small conductance


5.71 


6.87 


444670 


H58373 


Ks.332938 


hypothetical protein MGC5370


1.98 


33.00 


444931 


AV652066 


Hs.75113 


general transcription factor IIIA


1.00 


54.00 


4461Q2 


AW168067 


HsJ17694 


ESTs 


1.00 


1,W) 


446163 


AA026B80 




Homo safnsns cONA FU1 3603 Gs, dons PL 


1.00 


35iJ0 


446469 


BE(S4848 


Hs.15113 


homogentisate 1,2-dioxygenase (homogenti


1.00 


11.00 


447360 


AW^0534 


H&76277 


Homo sapiens, clone MGC:9381, mRNA, comp


1.24 


1.16 


44^32 


AKD00614 


Hs.18791 


hypothetical protein FLJ20607


1.23 


1.63 


446243 


AW3o9771 




iraegnn. oeta 0 


id.o4 


I.UU 


448844 


A1581519 


Hs.177164 


ESTs 


1.00 


31.00 


449444 


AW818436 


H&23990 


solute carrier family 16 (monocarboxylic


1.00 


83.00 


451807 


W52854 




hypolheScal protein FU23S3 sinflar to 


1.55 


35.00 


Asm 


F33868 


H&^84176 


transferrin 


1^ 


1.44 


453392 


U23752 


(fe^2964 


SRY (sex determining region Y)-box 11


• 1.00 


16.00 


453464 


A16B4911 


Hs32989 


receptor (calcitonin) activity modifying


1.55 


145 


453735 


AI066629 


Hs.125073 


ESTs 


1.01 


1.30 



TABLE 11B

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
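
To make the three-column layout of this table concrete, here is a minimal Python sketch that splits one row into its Pkey, CAT number, and list of accession numbers. The function name `parse_cluster_row` is hypothetical, and the sketch assumes the fields are separated by whitespace, which holds only for rows that were printed cleanly.

```python
def parse_cluster_row(row: str) -> dict:
    """Split a row into Pkey, CAT number and accession list.
    Assumes whitespace-separated fields (an assumption of this sketch)."""
    fields = row.split()
    if len(fields) < 3:
        raise ValueError("expected Pkey, CAT number and at least one accession")
    return {
        "pkey": int(fields[0]),
        "cat_number": fields[1],
        "accessions": fields[2:],
    }
```

Rows in which adjacent accession numbers have run together would need to be separated by hand before a parser of this kind could be applied.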

Pkey CAT Number Accession

410399 11995 V BE068889BE068882AR)44311 AiH)17256mL003067AI^37207AF010126AA633976AA872836BE298825BE 

Ai936527 AA804675 AA394097 AI139933 AA946^ BE171 313 AA722407 AA293803 A)468460 AA0a6035 AA055968 AW796957 AI63ni3 
AA410737 H49348AA486472AA411094AA235594AA4Q2624 AA443638AW452137 AA421708AVV^5211 A149326^ 

419502 18535 1 AU076704T74854 T74860T72098 T73265T73873T69180T7465BT58788 T60385T73410T68781T6784ST67593 T73952T^ 

T68367 T68401 T53959 T72360 T72099 T60377 T58961 T71712 T72B21 T64738 T74645 T72037 T68688 T72063 T73258 T72826 T64242 
T6B220T74673T71800T68355T61227T62738T69317 T53850 T64892 T7376BT73962 T73382T68914 T70975T734MT606^^ 
T73203 T70498 T61409 T58925 NM.000908 liffi4982 TB8301 T73729 T69445 T60424 T67922 T67736 T6B716 T67755 T74765 r738ig T58719 
T74756 160477 T74883 T61 109 T68329 T58B50 T71857 T73425 T53736 T68807 T5B898 T64309 T72031 T72079 T54305 T71 908 T6ei 07 
T71916 T73787 T56035 T64425 T71870 T60476 T61 376 T67820 T71B95 T41005 T69441 T68170 T74617 T7195B T69440 T61875 R05796 
H48353 T71914 T53939T64121 AA693996 T72525 T67779 766078 AA01 1465 AA34537fl AV654847 AV554272AVe56001 Ai064740 T82897 
N33594 AA344542 AVtf8050S4 AI207457 T61743 AA026737 H94389 AA38269S AA918409 T6804^ 

AA312919T4D158 H66239 AV652989 H38728 R98S21 AV655200 RK790 W032SD WD0913 AA344136 AV660126 R97923 AA343596 
AW470774AV651256 N54417AA812862AW1B2929A}111192H61463 H72060AA344503 H38639A{277511AV561108Ai2 
AA235252 T27B53 T47778 R95746 H70520 AA701463 AW827166 R9B475 C20925 AV657287 T71959 T7131 3 T73920 T73333 T61 618 T69293 
T69283 T73931 n21 78 T72456 AV645639 AV653476 T72957 T72300 T58906 T71457 T70494 n2956 T70495 T68257 T74407 T85778 
AA344726 T27854 T74465 T74101 T73868 T71518 T72304 AA343B53 T73909 T68070 T72065 H72149 T73493 T73495 AV645g93 R02293 
T70475 T64751 AA344441 AA343657 AA345732 AA344328 All 10639 AA344603 AF063513 T64698 T68S16 T72223 T60S07 T67633 R29500 
T72517 R02292 T60599 T59206 T70452 T74677 R29366 T61277 T74914 T60352 R29675 T74843 AV645792 AA344408 T69197 T72057 
T69368 T59358 T68256 AV650429 T73341 T61702 T74598 T40095 KD2272 T40105 AA343045 AA341 908 AA341907 AA342807 AA341 964 
T53747 T72042 T62754 A1054899 AA343080 T57832 T72440 T71 770 T68091 T69108 T72449 T69157 T712B9 T68251 AVB54844 T64375 
AA345234 T67598 AA011414 T680^ H48262 A1207557 TB8219 W85031 T69081 T64232 R93196 T82138 AV650539 H67459 T72978 
AA344583 T60362 H58121 T95711 T72803 T68055T71715R29036n2793T69122T64595T528B8T69139 T68291 T64552 T67971 T46B62 
AA693592 AI248502 R29454 T64764 T57001 T73052 T714a T51176 TK866 AV655414 H90426 AA342489 773666 T67848 T72512 T53835 
T67837 T7331 7 T74273 T69420 T68245 T74380 T67862 T74474 T56068 

421582 2041J A1910Z75 X00474 X52003 X05030 NM_003225 AA314326 AA308400 AA5067B7 AA314825 A1571948 AA507595 AA614579 AA587613 F»3818 

AA556312 AA614409 AA^578 A)925552 AW950155 A!910083 M12075 BE074052 AW004668 AA578674 AA5820S4 BE074053 BH074126 
BE074140 AA514776 AA5B8034 BE074051 BB}74068 AW00976g AW0S0690 AA658276 R55389 AfOOIOSI AW0S07DO AW750216 AA614539 
BE074O45 A1307407AW802303BE073575 AI202532 AA524242AI9708^ A1909751 BE076078 AI909749 R55292 

437886 44433J AA156781 AW293B39 U52054 AA024963 AA778446 BE073977 AW444904 AW502S74 BE1 64040 BE164012 BE163972 BE163974 BE163992 

AA837481 AW468444 BEt85091 AW4660Q2AA687333 AAB1 1830AA581806 A1666686 AI572124 AA043777 AA040926 D20160Ai536733 
AA812489 AW874142A)471863 W84421 AA156650 

451807 8865.1 ^52854 AL1 17600 8E2081 16 8E208432 BE206239 BE082291 AW953423 AA351619 BE180648 BE140560 W800B0 AA865478 N9Q291 

AW450852AW449519AA993634At80^AA351618AVIM49522Al82?826AA904768AA38038lAA 

TABLE 11C



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA

sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position
403329 8516120 Plus 96450-96598
406399 9256288 Minus 63448-63554
TABLE 12A: Genes Distinguishing Squamous Cell Carcinoma...

Table 12A shows about 72 genes upregulated in squamous carcinoma. These genes
were selected from about 59580 probesets on the Eos/Affymetrix Hu03 Genechip array.

Table 12B shows the accession numbers for those Pkeys lacking UnigeneIDs. For each probeset we have listed the gene cluster number from which the

oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence

similarity using Clustering and Alignment Tools (DoubleTwist). The Genbank accession numbers for sequences comprising each cluster are listed in the

'Accession' column.

Table 12C shows the genomic positioning for those Pkeys lacking UnigeneIDs and accession numbers. For each predicted exon, we have listed the genomic

sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the
average of normal lung samples

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples



Pkey


ExAccn


UnigeneID


Unigene Title


R1


R2


400289 


X07620 


H5.2258 


matrix metalloproteinase 10 (stromelysin


132.45 


4.00 


400666 






NM_002425:Homo sapiens matrix metallopro


3.26 


122 


401780 






NhLOOSSSriHomo sapiens keratin 16 (foca 


26.47 


1030 


401781 






Target Exon 


ia33 


4.61 


401785 






NM_002275*:Homo sapiens keratin 15 (KRT1


4.13 


Z70 


401994 






Target Exon 


61.84 


47.00 


402075 






EN^000002S1056*Plasn» membrane calcium 


1.00 


1.00 


404996 






Target Exon 


1.00 


1.00 


407839 


AA045144 


Hs.161556 


ESTs 


173.91 


loaoo 


408000 


LI 1690 


Hs.620 


bullous pemphigoid antigen 1 (230/240kD)


151.17 


m 


408522 


AI541214 


HS.4S320 


Sm^ pralne^lch piot^ SPf^ (human, 


1.98 


1.24 


410561 


BE54Q255 


HSJ6994 


Homo sapiens cDNA: FLJ22044 fis, clone H


10.04 


1.00 


415091 


AL044872 


Hs.77910 


3-hydroxy-3-methylglutaryl-Coenzyme A sy


1.00 


30.00 


415B17 


U88967 


Hs.78867 


protein tyrosine phosphatase, receplor-t 


24.30 


1.00 


416658 


U03272 


Hs.79432 


fibrillin 2 (congenital contractural ara


53.29


51.00


417034


NM_006183


Hs^2


neurotensin


1.00


1.00


417366 


BE185289 


Hs,1076 


small proline-rich protein 1B (cornifin)


6.97 




416663 


AK001100 


Hs.41690 


desmocollin 3


112.17 


19.00 


418678 


NM_001327 


Hs.87225 


caix^AesSs antigen 


1.18 


1.10 


419121 


AA374372 


K&89626 


parathyroid hormone-like hormone


t.00 


1.00 


420783 


AI659836 


Hs.99923 


lectin, galactoside-binding, soluble, 7


ao4 


1^ 


421773 


W69233 


Hs.112457 


ESTs 


1.12 


1.14 


421948 


L42563 


Hs.334309 


keratin 6A


51.83 


20.25 


421978 


AJ243662 


Hs.110196 


NICE-1 protein


1.01 


0.91 


422158 


L10343 


Hs,112341 


protease inhibitor 3, skin-derived (SKAL


zzr 


1.10 


422440 


NM.004812 


Hs,116724 


aldo-keto reductase family 1, member B10


47.53 


3Z00 


423634 


AW9S9908 


Hs.1690 


heparin-binding growth factor binding pr


76.02 


1.00 


423725 


AJ403108 


H8.132127 


hypotheOcal protein ljOCS7822 


4.20 


1.00 


423738 


AB002134 


Hs.132195 


airway trypsin-like protease


10.14 


5130 


424012 


AW368377 


Hs.137559 


tumor protein 63 kDa with strong homolog


233.42 


68.00 


424046 


AF027856 


Hs.138202 


serine (or cysteine) proteinase inhibito


1.00 


1.00 


424098 


AF077374 


Hs.139322 


small proline-rich protein 3


137.82 


5430 


424B34 


AKD01432 


H&153408 


Homo sapiens cDNA FLJ10570 fis, clone NT


56.19 


1230 


425650 


NM.001944 


Hs.1925 


desmoglein 3 (pemphigus vulgaris antigen


33.45 


1.00 


427099 


AB032953 


Hs.173560 


odd Oz/ten-m homolog 2 (Drosophila, mous


4.24 


17,00 


427335 


AA448542 


Hs.251677 


6an6gen7B 


5133 


4.00 


428182 


BE386a42 


Hs^17 


ESTs. Weak^ slmSar to GGC1.HUMAN 6 ANT 


1.00 


1.00 


428645 


AA4314Q0 


H5.98729 


ESTs. Weddy similar to 201 7205A (fihydre 


1.00 


1630 


428748 


AV\593206 


HS38785 


Ksp37 pmteln 


1.00 


8730 


429259 


AA420450 


Hs^ll 


ESTs. Highly mia to S60712 band^ 


2.01 


1.18 


429538 


BE182592 


Hs,11261 


small proline-rich protein 2A


4.43 


Z90 


429903 


AL134197 


Hs.93597 


cyclin-dependent kinase 5, regulatory su


11.80 


1.00 


430486 


BE062109 


Hs^41551 


chloride channel, calcium activated, fam


12^ 


41.00 


430890 


X54232 


Hs^ 


glypican 1 


158 


1.40 


431009 


BE149762 


Hs.48956 


gap junction protein, beta 6 (connexin 3


60^ 


28.00 


431846 


BE019924 


Hs^l^ 


uroplakin 1B


4.49 


2.51 


433091 


Y12642 


Hs^185 


lymphocyte antigen 6 complex, locus D


1.20 


1.09 


434360 


AW015415 


Hs.1 27780 


ESTs 


4038 


27.00 


434860 


U02388 


H5.101 


cytochrome P450, subfamily IVF, polypept


1.00 


m 


435505 


AF200492 


HS211238 


interleukin-1 homolog 1


1.00 


3830 


435793 


AB037734 


K&4^ 


KIAA1313 protein 


2338 


42.W 


43S11 


AA721252 


Hs^1502 


ESTs 


16.76 


14.00 


438403 


AA806607 


Hs^92206 


ESTs 


1.00 


1.00 


439285 


A1133916 




hypothefical protdn FIJ2{X}93 


46.23 


139.00 


439608 


W79123 


Hs^l 


G protein-coupled receptor 87


3331 


1.00 


439670 


AF088076 


HS59507 


ESTs. Weddy simlsr to AC004856 3 U1 sm 


1.00 


1.00 


439706 


AW872527 


Hs^9761 


ESTs. Wea%siniarto DAP1 DEATH 


8635 


1130 


440325 


NM.003812 


Hs.7164 


a disintegrin and metalloproteinase doma


62.88 


147.00 


441525 


AW241867 


Hs.127728 


ESTs 


133 


1.42 


443162 


T49951 


H55029 


DKFZP434G032 protein


31.11 


3830 


444378 


R41339 


Hs.1^ 


ESTs 


1.00 


130 
446292


Arvo»497 




Rh ^^^^ 0 Qtycopyotfi'^ 


447078 


AW885727 




ESTs 


447342 






Homo sapiens, Similar to RIKEN cDNA 2010


449003 


X76342 


HsJ89 


steotol (fehydrogenase 7 (d3s$ tV), imi o 


449101 


AA205847 


Ks^3016 


G protein-coupled receptor


450832 


AW970602 


Hs.105421 


ESTs 


4S2240 


AIS91147 


Hs,61232 


ESTs 


453317 


NM-002277 


Ks.4136 


Keratin, hair, adtficl 


453830 


AA534298 


Hs.20d53 


ESTs 


454098 


WZ7953 


Hs^2911 


ESTs. Highly sinilar to S6071 2 band^ 


455601 


M368680 


Hs^ie 


SRY (sex determining region Y)-box 2



TABLE 12B 



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



PCT/US02/12476 



1.55 


1.26 


4724 


24XJ0 


28.63 


1.00 


1.00 


1.00 


ZSB 


27.00 


25,17 


3S.00 


13.42 


1.00 


1.19 


1.27 


24.92 


2&00 


1.26 


1.11 


206.11 


1.00 



Pkey CAT Number Accession

439285 47065 1 AL133916IO9113AF086101 N76721 AVV95082BAA364013AW3S5664AI346341 AI867454h^ 

AA7755S2N62351 N59253AA626243A1341407BE175639AA45696BAI3589iaAA457077 



TABLE 12C

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.

Pkey    Ref      Strand  Nt_position
400666  8118496  Plus    17982-18116,20297-20458
401780  7249190  Minus   28397-28617,28920-29045,29135-29296,29411-29567,29705-29787,302243(873
401781  7249190  Minus   83215-83435,83531-83656,8374O^1,84237-84393,84955^7,86290^14
401785  7249190  Minus   165776-165996,166189-166314,16640d-166569,167112-167268,167387-167469,166634-168942
401994  4153858  Minus   42904-43124,43211-43338,44607^763,4519^5281,46337^32
402075  8117407  Plus    121907-122035,122804-122921,124019-124161,124455-124610,125672-126076
404996  6007890  Plus    37999^145^38652.36998,39727-39872,40557-40674,42351-42450
TABLE 13A: Genes Distinguishing Non-Malignant Lung Disease from Lung Tumors and Normal Lung

Table 13A shows about 23 genes upregulated in non-malignant lung disease [...] the Eos/Affymetrix Hu03 Genechip array.

Table 13B shows the accession numbers for those Pkeys lacking [...] For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the 'Accession' column.

Table 13C shows the genomic positioning for those Pkeys lacking [...] For each predicted exon, we have listed the genomic sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the average of normal lung samples
R2: Average of non-malignant lung disease samples (including bronchitis, emph[...]
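The R1 column is defined above as a ratio of group averages. A minimal sketch of that calculation follows, assuming expression intensities are available as plain lists of numbers per sample group. The function name and the sample values are hypothetical and are not taken from the tables; the R2 denominator is truncated in the legend, so only the R1 form is illustrated.

```python
from statistics import mean


def ratio_of_averages(numerator_samples, denominator_samples):
    """Return mean(numerator_samples) / mean(denominator_samples)."""
    denominator = mean(denominator_samples)
    if denominator == 0:
        raise ValueError("average of denominator samples is zero")
    return mean(numerator_samples) / denominator


# Hypothetical intensities for one probeset (illustrative values only).
tumor_samples = [120.0, 80.0, 100.0]
normal_samples = [10.0, 10.0, 10.0]

r1 = ratio_of_averages(tumor_samples, normal_samples)
print(f"R1 = {r1:.2f}")
```

A ratio near 1.00 in this scheme means the two group averages are about equal, which is how the many 1.00 entries in the R1 column can be read.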





Pkey    ExAccn     UnigeneID  Unigene Title                               R1     R2
408562  AI436323   Hs.31141   Homo sapiens mRNA for KIAA1568 protein,     1.00   23aoo
409031  AA376836   Hs.76728   ESTs                                        1.00   128.00
412372  1^85998    Hs.285243  hypothetical protein FLJ22029               1.00   17100
415910  U20350     Hs.78913   chemokine (C-X3-C) receptor 1               1.00   145.00
417511  AU)49176   Hs.82223   chordin-like                                1.00   iTaoo
418819  AA228776   Hs.191721  ESTs                                        1.00   140.00
422060  R20893     Hs.325823  ESTs, Moderately similar to ALU5_HUMAN A    1.00   156.00
424565  AA464840   Hs.131987  ESTs                                        1.00   157.00
426753  T89832     Hs.170278  ESTs                                        1.00   141.00
429496  AA453800   Hs.192793  ESTs                                        1.00   moo
430719  AA4889B8   Hs.293796  ESTs                                        1.00   133.00
431089  BE041395              ESTs, Weakly similar to unknown protein     23.32  941.00
431385  BE178536   Hs.11090   membrane-spanning 4-domains, subfamily A    1.00   157.00
431728  NM_007351  Hs.268107  multimerin                                  1.00   157.00
436532  AA721522              gb3ivS4h12j1 NCLCGAP^ Homo sapiens          1.00   218.00
437960  Al^9586    Hs.222194  ESTs                                        1.00   147.00
438202  AW169287   Hs.22588   ESTs                                        1.00   141.00
441499  AW298235   Hs.101689  ESTs                                        1.00   167.00
444513  AL120214   Hs.7117    glutamate receptor, ionotropic, AMPA 1      1.00   151.00
448253  H25899     Hs.201591  ESTs                                        1.00   141.00
453S35  R67837     Hs.169872  ESTs                                        1.00   11&00
458332  AI000341   Hs.220491  ESTs                                        1.00   19100
459587  AAOSISSO              gtezlc15e04.8l Soares_pregnantJUtenisJ{bH   1.00   15400



TABLE 13B 



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey    CAT Number  Accession
431089  327825J     BE041395 AM91826 AA621946 AA71S980 AA666102
436532  421802.1    AA721522 AW975443 T93070



TABLE 13C

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.

Pkey    Ref      Strand  Nt_position
402075  8117407  Plus    121907-122035,122804-122921,124019-124161,124455-124610,125672-126076
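The Nt_position column packs the predicted exons into one string of start-end pairs. A minimal sketch of reading such a string follows, assuming hyphens separate start from end, commas separate exons, and coordinates are inclusive (all three are assumptions about the layout, not statements from the source). The function name is hypothetical, and the example string is the entry for Pkey 402075 as it reads once the damaged separators are restored.

```python
def parse_nt_position(text):
    """Split an Nt_position string into a list of (start, end) exon tuples."""
    exons = []
    for part in text.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if end < start:
            raise ValueError(f"exon end precedes start: {part!r}")
        exons.append((start, end))
    return exons


# Entry for Pkey 402075, with separators normalized (assumed reading).
nt_position = "121907-122035,122804-122921,124019-124161,124455-124610,125672-126076"

exons = parse_nt_position(nt_position)
# Inclusive coordinates are assumed, hence the +1.
lengths = [end - start + 1 for start, end in exons]
print(len(exons), "exons, total length", sum(lengths))
```

Entries whose digits are still illegible will raise a ValueError here, which is a convenient way to flag rows that need checking against the original filing.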
TABLE 14A: Preferred Utility and Subcellular Localization

Table 14A shows the subcellular localization and preferred utility [...] mAb symbolizes monoclonal antibody, diag symbolizes diagnostic, s.m. symbolizes small molecule [...] Genechip array.

Table 14B shows the accession numbers for those Pkeys lacking [...] For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the 'Accession' column.

Table 14C shows the genomic positioning for those Pkeys lacking [...] For each predicted exon, we have listed the genomic sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
Pref. Utility: Preferred Utility
Pred. Loc: Predicted subcellular localization



Pkey


ExAccn


UnigeneID


Unigene Title


Pref. Utility


Pred. Loc


400259 


X07820 


Hs.2258 


matrix metalloproteinase 10 (stromelysin


mAb & diag & s.m.




400303 


AA24275a 


Hs.79136 


UV-I protein, estrogen regulated


mAb 




402075 






ENSP00000251056*:Plasma membrane calcium


mAb & diag


secreted


407811 


AW1 90902 


Ks.40098 




di^ 




40^43 


Y00787 


H5.624 


interleukin 8


diag


secreted 


408790 


AW58Q227 


Ks.47860 


neurotrophic tyrosine kinase, receptor,


mADaSJiL 


plasma iiuiiilxaiie 


408908 


BES6227 


Hs^50822 


senne/uueonine KmasB is 


sjn. 


cytoplasm 


409041 


AB033025 


Hs.50a81 


nypoineiicai pnnein, XtJUdiodu (naai 19 


biL&aag 


OSvlvlCU 


409103 


AF251237 


Hs.112208 


XAGcO protetn 


CTL 




409420 


Z15008 




lananin. ganuna i (nicen iiuokuJi Kain 


(Sag 




409832 


W74001 


Hs^577S 


serum yx (^wuie/ proiBinaso uiiuinui 


(Bag 




409757 


nivi^uv I090 


ns. izo 1 14 


cysiaun on 

minichiDmosoine m^nienance de&^t {S. 


QtSy 




409893 




ns.o/ ivi 


CTL 


mxdear 


409956 




Hs.727 


inniDini oeia a (acuvin Ai acuvm ad a 


oiag 


extraoeOular 


410001 


AB041(^ 


Hs.57771 


KBUUllWl 11 


(BS9 


exlAJwIluteif 


410407 


XB6839 


kfe 63287 


Aarlvmir snhuriraeo lY 
caiuonic Bni^fWdse Ia 


nnM)&sjn. 


pla&nQ membrane 


410418 


D31382 




bansmernbrsne protease serine 4 


innO at Dm^ ft SJlk 

Bin. 


plasma meintjrane 


412140 


AA2igfi91 


Hs.73625 


KAoD imeracung, nnesoHiXB (raonnes 




412719 


AVUD1661D 


Ns.816 


CO IS 


SiHL 


nudear 


414774 


XQ2419 


Hs.77274 


plasminogen activator, urokinase


(&Q 


sxIraBenular 


414683 


AA926960 




/'M'W"*Oft ^iftloirt Lliiaeft A 
\A^kjiJo\K\j}BU\ KBlsSe 1 


SJTL 




415138 


CI 8356 


Hs.29^44 


ussue lactor paunray nruDHor i 


CTL & (Sag 




415669 


NM 005025 


H&.78589 


senno lor cyssinef proioindse miuunD 


m/^ & diag & s Jn. 




415817 




ns.f ooof 


protein tyrosine phosphatase, receptor-t


mM)&sjiL 


plasma msnofsne 


416658 


U03272 


Hs.79432 


uonufli z |Cungeiiu3 ouiiuduuira ara 


(Dag 


extracelhJlar 


417034 


NM 006183 




neuroten^ 


(nag 


extracellular 


417079 


U65590 


Hs81134 


ifuBneuxin i receptor aniagonisi 


diag 




417308 


rtouf m 




ru/vwiui geno pruauci 


sjn. 


mitochondrial 


417389 


BE260964 


Hs 82045 


midkine (neurite growth-promoting factor


mAb & diag


secreted 


417433 


BE270266 


Hs8212fi 

n9.u& 1 CO 


5T4 oncofetal trophoblast glycoprotein


mAb 


piasrna memtvane 


417933 


X02308 




thymidylate synthetase


sjn. 


BnUOpiaSllllw ICUCUlUin 


418478 


U%945 


H8.1174 


cyclin-dependent kinase inhibitor 2A (me


Sin. 


cytoplasm 


418508 


AA084248 


HSJ5339 


G protein-coupled receptor 39


RiAb&sin, 


plasma memtirane 


418678 


NM_001327 


H8.167379 


cancer/testis antigen (NY-ESO-1)


CTL 


(»ytofriasmic 


419121 


AA374372 


H5.89626 


parathyroid hormone-like hormone


diag 


secrtied 


419171 


NM_002846 


Hs^655 


protein tyrosine phosphatase, receptor t


mAb&8.m. 


plasma memlxane 


419183 


U60669 


Hs.89663 


cytochrome P450, subfamily XXIV (vitamin


CTL&sin. 


mioxsKiiuinai 


419216 


AU076718 


Hs.164021 


smaO iridudble cytol(bte subfamily 6 (Cy 




secreted 


419235 


AW470411 


H5^33 


neurotrinw 


mAb&diag 


plasma rr^tirane 


419452 


U33635 


^.90572 


prK7piotem tyrosine f(&iase7 


mAb&s.m. 


plasma membrane 


419555 


U29615 


Hs.91093 


chitinase 1 (chitotriosidase)


mAb&diag 


extraoelhilai* 


420610 


AI683183 


Hs39348 


di8t^4esshQmeobaK5 


CTL 


mdear 


421110 


AJ2S0717 


H&1355 


cathepsin E


sm&(Sag 
dag 


extracrihrlar 


421379 


Y15221 


Hs.103982 


smaO indudUe cytokine subMIy B (Cy 


8e(<r^d 


421474 


U76362 


Hs,104637 


solute carrier family 1 (glutamate trans


mAb&sia 


plasma memlirane 


421552 


Ai=026692 


Hs,105700 


secreted frizzled-related protein 4


(Sag 


secreted 


421753 


BE314828 


Hs.107911 


ATP-binding cassette, sub-family B (MDR/ 


mAb&sjn. 


plasma membrane 


421817 


AF146074 


H$.108660 


ATP-binding cassette, subfamily C (CFTR 


mAb&Sin. 


plasma memtjrane 


422109 


S73265 


Hs,1473 


gastrin-releasing peptide


diag 


secreted 


422158 


L10343 


Hs.112341 


protease inhibitor 3, skin-derived (SKAL


diag


secreted 


422282 


AF0192S 


Hs.114309 


apolipoprotein L


(flag 
Sin. 


secr^ed 


422283 


AW411307 


Hs.114311 


COC45 (ceH tfviskn cycte 45, SxeiBvis 


nuciear 


422424 


A)166431 


H5^96638 


prostate differentiation factor


diag 


extraceliidar 


422765 


AW409701 


Hs.1578 


baculoviral IAP repeat-containing 5 (sur


Sin. 


(qftD|4asni 
nmlear 


422809 - 


Al^1379 


Hs.121028 


tiypofteflgJ protein FU10549 


sia 


422867 


L32137 


Ks.t584 


cartilage oligomeric matrix protein (pse


(figg 




422956 


BE545072 


H5.12257g 


ECT2 protdn (EliQteBa] cell transfmrti 


CTLiSin. 




423634 


AW959908 


Ks.1690 


heparin-binding growth factor binding pr


d|}^g 




423873 


PPW>3054 


H&1695 


matrix metalloproteinase 12 (macrophage


mAb&dag&sia 


secrrfed 


423981 


013686 


H5.136348 


periostin (OSF-2os)


mAb&diag 


extraceHutar 


424046 


AR127866 


H5.t38202 


seiina (or qfstdne] pro^iase bddlAo 


diag 


sBcirted 


424381 


AA28S249 


HS.146329 


prafieln Idnse GM^ 


Sin. 


nudear 



\ 
ns* 149303


efiQstn 


SJTL 


cytoplaanic 


4243)3 




nS.i490U9 


me^nn, apna diuORinecun rec^nv, 


InADSSJIL 


|4asnB meirtirBne 


4246B7 


J05070 


ns.i9lr<30 


matrix mdaiiQprotanase 9 (gei^na^ B 


Oag 




4K247 






ntatrix metaSoprotaoiase 1 1 (stroneiysin 


mAD & otsg a sjn. 


iifeciBiefl 




U63B30 


Lie ICttTT 
ns. 10300/ 


pmeoi unase; unA-ecQvaisa, csiayDc 


SJIL 


cytoplasirnc 




NaLWl944 


Uc KMC 

ns.13/3 


(tesmog!^ 3 [pcfnphiQus vulgaris arGgen 


mAk 


plasma membrane 


429 fJ4 


AP0562)9 


n5.1*KU30 


pepSdylgtycine stpha^airdda&ig monooxyg 


sjn. 




4£9r /D 




n9m 199439 


paraihyroxl hotmone cflCflftUu 2 


ninD 9 QKjy 


ptoMIM IIRiTTUIIQliO 

ptasnQ (nembiane 






nd. 133031 


death receptor 6^ TNF supeffeniOy nmtor 


mAh A e m 






ns. 1 33tt J 


sianiDocaicui £ 


mAK A ffian 


secr^sd 


AOCAVJ 


MB6699 




TTKpiolBtn kina^ 


CTL& SJTL 


s^^d 




DCDIOOM 




bone (noffdw^neSc protein 7 (osteoQenic 








AAiU(KA9 
/W44004< 


Ue 9<;1R77 


V7 aiiQQen /□ 


OIL 


cycpiaanNO 


Avnn 




Ue IftftRCC 


^nneAhreonfate kitose 1 2 


SJTL 


cytoplasms 






Ue 


leuKenud mninuDiy uhu icnauiayic 


<&g 




^8330 


I.Z23Z4 


Ue 


mamx meiauoprweoiase f {ftianajSui, 


inAb& (Gag & SJn, 


exoaoeuuiar 


428450 


Kit J A4 ilTM 

NM.0i4791 


nS.io4339 


KIAA01 75 gene proouct 


8JIL 


nuclear 


42ll4f9 


TUU2/« 




oeu onnsnn Qfcs ^ ui lo o eno w d 


SAL 


nuclear 


il9flilU 




14a IQAfiM 

ns.io40ui 


Sonne canisrianiif r icsuoiuc ensno 


fnAh A e m 


pbsma memteane 


428ro4 


AKDUIooo 


n5.1 09095 


* - 1^ A A 114 f--J . BL— 

amiiarto SALLi (sapilnsopniaiHwB 


CTL&Sin. 


nudear 




AA0C077Q 


Ue 9^090 


iNiAA 1000 proten 


mAk 
nWD 




428748 


AWo9320o 


Ui> OflrYOC 

nS.9B783 


Ksp37 protein 


Qiag 


extrace&uiar 








CA125an6o8n( nnidn 16 


oiag 


nflKhodrisT 




AC19A97A 






(Gag 


esdraoeBuIar 


42941 1 






gap juncflon prattii^ bets 5 (oonneidn 3 


inAb&sjtL 


plasma n^mbrane 
plasma n^mbrane 


4&9ZD0 






A 1 r*oinotiQ casse^ suir Tantiiy A |Ad 


mAb& SJIL 


44Sfi)4f 


AWUUsibO 


Ue QQ^TC 


CCTe 






4Z901Q 




Ue 94inQO 


1 1 IMV nrntnbi* Dl 1 (UP htfM* ami noe 


mAk t. #Bm 


secieieQ 


429903 


Al 4%I107 


Ue 09CQT 


cydiiwl^iendenl l^ase 5. r^ulatoiy su 


SJn, 




430488 




nS^4l93l 


cniofKie cnanne^ caiQurn ataroaiea^ lani 


fnAk £ e m 

mAoa SJn. 


plasma mernDrane 


il 91 it CO 


AUUCfi^CTO 
AW9000r4 




granin-nke neuroendcxsrine pepfide precu 


dl^ 


ovtra^attirior 
cXliaMiUUldl 


431515 


NM.01Z152 


HS^5B5o3 


endolheOa) dif&ren&afion, lysophosf^ 


m^& SJTL 


plasma membrane 


431846 


Bo0i9924 


nS^IOOO 


un)}da]dn IB 


mAb& dl^ 


plsina membrane 


431958 


)w3o2S 


Hs.2877 


cadnenn 3, q^pe 1, P-cadnenn (placenla 


mAb&dlag 


plasma membrane 


432201 


AI538613 


Hs^8241 


Transmembrane pruteasa, serirs 3 


mAb & oiag & SJIL 


ptema men^irane 


433001 


AF217513 


HS^7S905 


oone nQ0310 PROujiOpi 


SJIL 


nuclear 


4355(S 


AF2u0492 


Ks21123o 


inteneuxuvi nomoiog 1 


diag 


secreted 


4^61' 


AAJ7959/ 


Hs^199 


HSrCiW protein aniitar to utxtjusin-con 


SJTL 




437016 


AllfVTCQIB 


Hs^398 


guanine monphosphale synthetase 


SJTL 


c^ftopiasm 


437044 


A I M^OCA 

Al0%o04 


n5.o9517 


dSierenSally expressed in Fanconf s an 




CD 


437789 


A|CQ4«i J 

Alaoio44 


ns.iZ7ol4 


CCT» UfaxMu MmtWlM r4799n h«mntkafl 


Ul L 


nuclear 


'iofoSu 


bciAl icuo 


Ue OCeODT 


tolo, weaiuy ssnuars) ojjo3ui£.i [rusa 


mAb& SJTL 


plasma membrane 


439223 




Ue OCAC4D 


ULi 0 wnQing proQin 2 


mAb 


plasma mendnane 


AViATJ 




Ub con/o 


CCTo lini4cmtaltf e(f«nw In f2CD9 U1 lUAM fl 

CO IS, Moaeraieiy sontar 10 brK4.nuMAN u 


mAb&SJiL 




43S0W 


W79i23 


HS.So5di 


6 prat^iHxiupled receptor 87 


mAk A c m 

mADaSJiL 


plasma menuvane 


439738 


QnACCM 

bcZ40aU2 


Ue OCBQ 


SQIBOCXnaBH UIVIHI^ 


mAk ft e m 


plasma membrarte 


440006 


AKD0{K17 


Hs.6944 


NALr2 (m}tein; PrRIN-Coniaining APAr i-fl 


6.nL 


nuclear 


441362 




nS.23044 


KAUal (o. cereviSiaej nonuicg (c cou xe 


s.nL 




442117 


AW6o49d4 


n5.1Zoo^ 


coTs; nypoinetical proton nx iMAw.44f 


mAb&SJiL 


plasma memtvane 


443Z47 


^»i43o7 


t1S>939BS9 


c-MyciaigetJrQi 


m 


exbacellular* 


AA'tA'ib 

44o42> 


ACnOfllCD 

AHSnlldtf 


Ue 094a 


duomoeonie 2Q open reading Iranie 1 


m 




443859 


NIVL0i3409 


Hs.9914 


foIHstalln 


(Sag 


cxnacenuifli 


4440D6 




KS.iOOoo 


type 1 transmembrens pratdn Ri14 


mAk 

niAO 


pBsma meniDrane 


444371 


Bc54uZ74 


Hs.239 


J. It. ■ k«A 

R)iKneaa box Ml 


SJIL 


nudear 


444381 


otMfSSs 


Um <9a9T49 


cSTs, weaxiy soniiar 10 ot>4U34 nypouteo 


Giag 


secreted 


444781 


NM_Ui44U0 


Hs.11950 


GPI-anchoied metastasis-associated pro£e 


mAk £ <l><an 


plasma fflembrBRB 


44990/ 


AJ245671 


H5.12844 


ciir^iKe-ooman, muuipQ 0 


mAb&dlag 


secfsled 


446619 


AUU7oo4J 


K5^13 


secreted phosphoproton 1 (osteopontin. 


dag 


secreted 




AdUiZi 10 


IJ«« 4ccon 
nS.lo530 


stnau inouciDie cyioKine susianuiy a 


diag 


exuaceiiuiar 


44 /Um 


AlOO/412 


Hs.1 57601 


CCTe 

boiS 


u 1 L ft OleQ 


secreted 


AA'nA'i 
44ii>44C 


AHMMfiQ 
AI1k>ZoO 


Hs.19322 


noino saproHs, aonuar 10 raiscn cujia zuiu 


m 




448243 


AW3097/ 1 


HS^2o20 


integrin, beta 6 


ITTAb& SJIL 


. k 
plasma memorane 


446844 


A1581519 


Hs.1 771 64 


ESTe 


mAb&SJTL 




449048 


Z45051 


Hs.22920 


Similar to Soo40i (canie} giucose inouc 


mAk 

mAD 


plasma membrane 


449722 


Dc2oUD74 


nS.23960 


cftSn B1 


s.nL 


cytopusm 


450001 


NMJ'u1044 


Hs.406 


soIutB canier tairily 6 (neuiolransn^ 


mAk Asm 

mAD&SJTL 


plasma membrane 


450375 


AAX9647 




a disinlegrin find metalloprotanase ddma 


nAk A jRm a ■• m> 

mAD & diag & SJIL 


plasma Rienibrane 


^0701 


n399o0 


HS.28d457 


Iq^oineoca protein ArjBSoioi (leucine- 


mAk ft «fftM 

mAD&oiag 


plasma menrivane 


450983 


AA305384 


HS75740 


ER01 {S. cerevislae)-Eke 


(Sag 


secreted 


451668 


Z43948 


H5^26444 


cartilage flci^Pc proton 1 


mAb&dlag 


plasma menibiane 


452281 


TB3500 


Hs.28792 


Homo sapiens cONA FU11041 fe. done PL 


diag 




452401 


NiyL007115 


H8.29352 


tumor neoDste factor, a^haMuced pro 


dag 


extracdhilar 


452747 


BE153855 


Hs.61460 


ig supertarrBiy lecepur lnik 


mAk 

nVuj 


plasma membrane 


452838 


U65011 


Hs^43 


preterentiadly expressed enflgen bi mda 


UIL 


rrudear 


4539^ 


AA647843 


Hs.62711 


KE^ mobffily gnitp(noidi!stoRechnvnoso 


CTL & SJIL 




457489 


Ai8938l5 


Hs.127179 


oypScgene 


diag 


secreted 


TABLE 14B 










Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey CAT Number Accession
4148fi3 1S024.1 AA928960 AA9269S9 W76521 W24Z70 W21S26AA037172 BE2S7638 H83186AM68909 NB6396AA001348 BE535738 AA081745 BE55624S

AA082436 H72SS H77575 N4g786 VV80S55 Hm46 BE5G9085 VVD4339 R98127 T55938 BEZ^ 
AA292753 M177048 NM-X1826 X54d41 BE314366 AA908m AI719075 BE27(n72 8E^19 AAB89955 A!^^ 
AAS72039 W72395 T99630 Al42^1 H98460 N31428 BE2S5916 m2B5 AIB57576 AA77^ AA91 0644 AA453522 AA2S3140 AW5U687 
R75953 AW662398 AA662522 AI865147 Al^53 AW262230 AA5B4410 AA5a3187 AWQ24595 AW063734 AI828996 AA28^ AA876046 
AW613002AA527373AW972459AIB31380AAS21337AA100926AA772418AA594628AI033892 
N95210AI459432A1041437AA932124AAB27684AA935829AI0048Z7AI423513AI094597H42079R^ 
AA643260 W44S1 A{991988 mmZ mOSBl AA740817 AI312104 At91 1822 AA4 W1 AI185409 AA1 29784 AA701623 AI075239 
A)139549 AA633848 AI339938 AI336^ AA3S9239 A107670B AI085351 AI3S2835 AI348618 AI146955 AI989380 AI348243 N92B92 AA765850 
A}494230 AI278687 AA96^ AM92600 W80435 AA001 979 f^424 Al 1 2901 5 fC41 27 AA1 57451 AA23S49 AA4^^ 
/^21 1 AWOS9S01 AW888710 R92790 N59755 At36112B AWS69407 H4772S H97534 H48076 H48450 T93631 AW30075B K03431 R767B9 
AA954d44 H7^ RS8823 AI457100 N92845 N49682 H42038 BE220698 6E220715 K99552 AA701 624 N74173 R54704 K79S20 H72323 
H03266BE261919AA769633AA480310AA507454AA910586AI203723AW10472W25611W2S071 T88980 H03S13T77589R99156 
WS095 R97470AA702275T77551 AA911952 K82956 N83673 AA283872 
450375 833Z7J AA009647AA131254AA3742S3AVV95440SH04410AVV606284AA151166BE157467BE157a^ K04384 W46291 AW863874H04021 K01532 

AA190993 H03231HS9805H01642AA852876AA113758AA62691SAA746952AI161014AA099^R69067 



TABLE 14C

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.

Pkey    Ref      Strand  Nt_position
402075  8117407  Plus    121907-122035,122804-122921,124019-124161,124455-124610,125672-126076
TABLE 15A: Information for all Sequences in [...]

Table 15A shows the Seq ID No, Pkey, ExAccn, UnigeneID, and Unigene Title for all of the sequences [...]

Table 15B shows the accession numbers for those Pkeys lacking [...] For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the 'Accession' column.

Table 15C shows the genomic positioning for those Pkeys lacking Unigene IDs and [...] For each predicted exon, we have listed the genomic sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.

Seq ID No: Sequence ID number
Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title



Seq ID No:
Seq ID No: 1 & 2
Seq ID No: 3 & 4
Seq ID No: 5 & 6
Seq ID No: 7 & 8
Seq ID No: 9 & 10
Seq ID No: 11 & 12
Seq ID No: 13 & 14
Seq ID No: 15 & 16
Seq ID No: 17 & 18
Seq ID No: 19 & 20
Seq ID No: 21 & 22
Seq ID No: 23 & 24
Seq ID No: 25 & 26
Seq ID No: 27 & 28
Seq ID No: 29 & 30
Seq ID No: 31 & 32
Seq ID No: 33 & 34
Seq ID No: 35 & 36
Seq ID No: 37 & 38
Seq ID No: 39 & 40
Seq ID No: 41 & 42
Seq ID No: 43 & 44
ESTs, Weakly similar to 21 0928QA B cell
peptidylglycine alpha-amidating monooxyg
hypothetical protein MGC5350
ESTs

unnamed protein product [Homo sapiens]

» » ' ■iiii^nTamnnn rinniilniit JC

minichromosome maintenance deficient (S.
v-ets erythroblastosis virus E26 oncogen
ESTs, Weakly similar to S41044 chromosom
guanine nucleotide binding protein 11
calcitonin receptor-like
cadherin 5, type 2, VE-cadherin (vascula
singed (Drosophila)-like (sea urchin fas
Homo sapiens HSPC285 mRNA, partial cds
complement component C1q receptor
ESTs

protease inhibitor 3, skin-derived (SKAL
plakophilin 3

RAN, member RAS oncogene family
parathyroid hormone-like hormone
low density lipoprotein receptor-related
endogenous retroviral protease
collagen, type XI, alpha 1
SRY (sex determining region Y)-box 4
guanine monphosphate synthetase
pituitary tumor-transforming 1
insulin-like growth factor binding prote
SRB7 (suppressor of RNA polymerase B, ye
butyrate-induced transcript 1
butyrate-induced transcript 1
small proline-rich protein 1B (cornifin)
H2A histone family, member X
gb:Homo sapiens full length insert cDNA
glycoprotein (transmembrane) nmb
alpha-fetoprotein

integrin, alpha 5 (fibronectin receptor,
matrix metalloproteinase 10 (stromelysin
matrix metalloproteinase 1 (interstitial
matrix metalloproteinase 1 (interstitial
solute carrier family 7, (cationic amino
tissue factor pathway inhibitor 2
G protein-coupled receptor 39
periostin (OSF-2os)
monokine induced by gamma interferon
5T4 oncofetal trophoblast glycoprotein
5T4 oncofetal trophoblast glycoprotein
cartilage oligomeric matrix protein (pse
small inducible cytokine subfamily B (Cy
ESTs, Weakly similar to S64054 hypotheti
UV-I protein, estrogen regulated
AdScan

KIAA1866 protein
hypothetical protein FLJ21080
secreted frizzled-related protein 4
Ig superfamily receptor LNIR
a disintegrin and metalloproteinase doma
stanniocalcin 2

matrix metalloproteinase 11 (stromelysin
Transmembrane protease, serine 3
collagen, type X, alpha 1 (Schmid metaph
ESTs; hypothetical protein for IMAGE:447
gap junction protein, beta 2, 26kD (conn
ESTs
ESTs
ESTs

c-Myc target JP01
transmembrane protease, serine 4
Hypothetical protein, XP_051860 (KIAA119
Hypothetical protein, XP_051860 (KIAA119
transcription factor
bone morphogenetic protein 2
fibrillin 2 (congenital contractural ara
cysteine knot superfamily 1, BMP antagon
Seq ID No: 462 & 463  4378S2
Seq ID No: 464 & 465  402075
Seq ID No: 466 & 467  421110
Seq ID No: 468 & 469  451668
Seq ID No: 470 & 471  451668
Seq ID No: 472 & 473  451668
Seq ID No: 474 & 475  422282
Seq ID No: 476 & 477  425852
Seq ID No: 478 & 479  439738
Seq ID No: 480 & 481  427747
Seq ID No: 482 & 483  420281
Seq ID No: 484 & 485  405932
Seq ID No: 486 & 487  405932
Seq ID No: 488 & 489  444342
Seq ID No: 490 & 491  421379
Seq ID No: 492 & 493  417079
Seq ID No: 494 & 495  430890
Seq ID No: 496 & 497  419721
Seq ID No: 498 & 499  444471
Seq ID No: 500 & 501  413063
Seq ID No: 502 & 503  433800
Seq ID No: 504 & 505  452401
Seq ID No: 506 & 507  452401
Seq ID No: 508 & 509  450001
Seq ID No: 510 & 511  410407
Seq ID No: 512 & 513  309931
Seq ID No: 514 & 515  412719
Seq ID No: 516 & 517  417034
Seq ID No: 518 & 519  430486
Seq ID No: 520 & 521  413753
Seq ID No: 522 & 523  425550
Seq ID No: 524 & 525  423873
Seq ID No: 526 & 527  416663
Seq ID No: 528 & 529  416863
Seq ID No: 530 & 531  429610
Seq ID No: 532 & 533  406690
Seq ID No: 534 & 535  431646
Seq ID No: 536 & 537  422158
Seq ID No: 538 & 539  431958
Seq ID No: 540 & 541  437044
Seq ID No: 542 & 543  428484
Seq ID No: 544 & 545  429211
Seq ID No: 546 & 547  417389
Seq ID No: 548 & 549  431009
Seq ID No: 550 & 551  417542
Seq ID No: 552 & 553  449230
Seq ID No: 554 & 555  410555
Seq ID No: 556 & 557  410555
Seq ID No: 558 & 559  424667
Seq ID No: 560 & 561  418462
Seq ID No: 562 & 563  410274
Seq ID No: 564 & 565  439606
Seq ID No: 566 & 567  404877
Seq ID No: 568 & 569  444781
Seq ID No: 570 & 571  418543
Seq ID No: 572 & 573  415817
Seq ID No: 574 & 575  415817
Seq ID No: 576 & 577  415817
Seq ID No: 578 & 579  41S17
Seq ID No: 580 & 581  41^17
Seq ID No: 582 & 583  415817
Seq ID No: 584 & 585  421817
Seq ID No: 586 & 587  418S78
Seq ID No: 588 & 589  418678
Seq ID No: 590 & 591  409420
Seq ID No: 592 & 593  332180
Seq ID No: 594 & 595  408790
Seq ID No: 596 & 597  408790
Seq ID No: 598 & 599  439223
Seq ID No: 600 & 601  409^
Seq ID No: 602 & 603  428959
Seq ID No: 604 & 605  4269G9
Seq ID No: 606 & 607  428^
Seq ID No: 608 & 609  428969
Seq ID No: 610 & 611  450701
Seq ID No: 612 & 613  450701
Seq ID No: 614 & 615  414774
Seq ID No: 616 & 617  407944
Seq ID No: 618 & 619  407S44
Seq ID No: 620 & 621  457489
Seq ID No: 622 & 623  4^7
Seq ID No: 624 & 625  407242
Seq ID No: 626 & 627  407242
Seq ID No: 628 & 629  407242
Seq ID No: 630 & 631  444006



BB)01838 




AJ250717 


Hs.1355 


Z43948 


Hs.326444 


Z43948 




Z4394B 


Hs.326444 


AF019225 


nS.1 l40U9 


AKDu1504 


Hs.1 59551 


Bk24D5u2 


Hs.9593 


AW4u425 


Lis ISfUSCC 


Al6Z9o93 


Lk ViftAtiA 

rl5JZ9494 


NfyL014398 


KS.1(W7 


Y15221 




U65590 


KSi£1134 


X54232 


Ks.2699 


NMj0016SO 


H128BSSD 


ABQ20684 


^11217 


ALD35737 


H5.75184 


AI034361 


nS.145l3li 


NM.007115 


H5.29352 


NM.00711O 


Hs.29352 


NIUX;1u44 


ns.4UD 


X66839 


fte.63^7 


AVv341683 




AVvuloolO 


Hs.816 


nM_0u6183 


Hs3u852 


BE062109 


HSJ41551 


U17760 


Hs.75517 


NM-001944 


H3.1925 


BE003054 


Hs.1695 


AKD01100 


Hs.41690 


AK001100 


Ks.41690 


AB024937 


Hs^1092 


M29540 


Hs.220529 


BE01^24 


Hs.271560 


LI 0343 


Hs.112341 


X63629 


HS.2B77 


AU135864 


Hs.69517 


AF104032 


HS.184&01 


AP052693 


K8.1 98249 


BE2609S4 


Hs.82045 


BE149762 


K5.4895D 


J04129 


H&B2269 


BE613348 


H8.211579 


U92549 


Urn. J04 A 


U92649 


Hs^ll 


J05070 


Hs.151738 


BE001596 


Hs.85266 


AA381807 


Hs.61762 


W79123 


Hs.58561 


NM_014400 


HS.119S0 


NMJXJ5329 


Ksx59o2 


U88957 


nS.76ra7 


U88967 


LI* 7(H}C7 

nS.7Mo7 


U88967 


ns.7B887 


U88967 


hb.78667 


U88967 


Hs.78867 


U88967 


Hs.78867 


AF146074 


Hs.1 08660 


tA4-001327 


HS.1S7379 


NM.001327 


Hs.167379 


Z15008 


Hs.54451 


AF134160 


Hs.7327 


AVV580227 


Hs.47860 


AW5B0227 


Hs.47880 


AW238299 


Ks.250618 


NM_001898 


KS.123114 


AF120274 


HS.1946B9 


AF120274 


H5.194689 


AF120274 


Hs.1 94689 


AF120274 


Hs.194oB9 


H3^0 


Ks.2Bo4o7 


K39960 


HS.2Bo4o7 


X02419 


Hs.77274 


R34008 


lit. mccm 


R34008 




A1693815 


Hs,127179 




Hs.99376 


(yll8728 




\mm 




M18728 




BE395085 


Hs.10086 



ESTs, Weakly similar to dJ365O12.1 [H.sa
ENSP00000251056*:Plasma membrane calcium

cathepsin E

cartilage acidic protein 1
cartilage acidic protein 1
cartilage acidic protein 1
apolipoprotein L

death receptor 6, TNF superfamily member
sema domain, immunoglobulin domain (Ig),
serine/threonine kinase 12
Predicted cation efflux pump
C15000305:gi|3806122|gb|AAC69198.1| (AF0
C15000305:gi|3806122|gb|AAC69198.1| (AF0
similar to lysosome-associated membrane
small inducible cytokine subfamily B (Cy
interleukin 1 receptor antagonist
glypican 1
aquaporin 4
KIAA0877 protein

chitinase 3-like 1 (cartilage glycoprote
lung type-I cell membrane-associated gly
tumor necrosis factor, alpha-induced pro
tumor necrosis factor, alpha-induced pro
solute carrier family 6 (neurotransmitte
carbonic anhydrase IX

gbiid13d01j(1 Soare5jJFU.T„GBC.S1 Homos
ESTs

neurotensin

chloride channel, calcium activated, fam
laminin, beta 3 (nicein (125kD), kalinin
desmoglein 3 (pemphigus vulgaris antigen
matrix metalloproteinase 12 (macrophage
desmocollin 3
desmocollin 3

LUNX protein; PLUNC (palate lung and nas
carcinoembryonic antigen-related cell ad
uroplakin 1B

protease inhibitor 3, skin-derived (SKAL
cadherin 3, type 1, P-cadherin (placenta
differentially expressed in Fanconi's an
solute carrier family 7 (cationic amino
gap junction protein, beta 5 (connexin 3
midkine (neurite growth-promoting factor
gap junction protein, beta 6 (connexin 3
progestagen-associated endometrial prote
melanoma cell adhesion molecule
a disintegrin and metalloproteinase doma
a disintegrin and metalloproteinase doma
matrix metalloproteinase 9 (gelatinase B
integrin, beta 4
hypoxia-inducible protein 2
G protein-coupled receptor 87
NM_005365:Homo sapiens melanoma antigen,
GPI-anchored metastasis-associated prote
hyaluronan synthase 3
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
ATP-binding cassette, sub-family C (CFTR
cancer/testis antigen (NY-ESO-1)
cancer/testis antigen (NY-ESO-1)
laminin, gamma 2 (nicein (100kD), kalini
claudin 1

neurotrophic tyrosine kinase, receptor,

neurotrophic tyrosine kinase, receptor,

UL16 binding protein 2

cystatin SN

artemin

artemin

artemin

artemin

hypothetical protein XP_098151 (leucine-

hypothetical protein XP_098151 (leucine-

plasminogen activator, urokinase

desmocollin 2

desmocollin 2

cryptic gene

ESTs

gb:Human nonspecific crossreacting antig
gb:Human nonspecific crossreacting antig
gb:Human nonspecific crossreacting antig
type I transmembrane protein Fn14
Seq ID No: 632 & 633
Seq ID No: 634 & 635
Seq ID No: 636 & 637
Seq ID No: 638 & 639
Seq ID No: 640 & 641
Seq ID No: 642 & 643
Seq ID No: 644 & 645
Seq ID No: 646 & 647
Seq ID No: 648 & 649
Seq ID No: 650 & 651
Seq ID No: 652 & 653
Seq ID No: 654 & 655
Seq ID No: 656 & 657
Seq ID No: 658 & 659
Seq ID No: 660 & 661
Seq ID No: 662 & 663
Seq ID No: 664 & 665
Seq ID No: 666 & 667



PCT/US02/12476 



4298SI 


NMJDOjBIq 


Kb.2442 


a disintegrin and metalloproteinase doma


422109 


S73265 


Hs.1473 


gastrin-releasing peptide


419235 


AW470411 


Hs.28o433 


neurotrimin


443048 


Z45051 


Ks.22920 


mirwJU-ir CCOJIM iMUleA i|1|iiij-ir n t«M4lf*» 

stmssr Q> Soo40i (catue; giuoos6 inGus 


419216 


AUQ767i8 


KS.1 64021 


smsii nauctoi8cyionne suDQiTDiy b 


431462 


AWSBSoTZ 


K5.2S6311 


granin-Cko neuRHflndooins pspOds )Sbcu 


448Z43 


AVIU09771 


Ks.52620 


iiuegnn, 0613 0 


426427 


M86699 


rts.1 69340 


TTK protein kinase 


445537 


AJ245671


Hs.12844 


tbr^iKe-ooinain, nuflapte o 


422278 


AF072873


Hs.114218


Dtzzso (urosopnuaj nomnoQ d 


428450 


NM_014791


Hs.184339


KIAA0175 gene product


446619 




Hs^3 


secreted phosphoprotein 1 (osteopontin,


453392 


U23752 




SRY (sex determining region Y)-box 11


426514 


DCOI0033 


nS.i7U199 


bone morphogenetic protein 7 (osteogenic


425776


U25126 




parathyroid hormone receptor 2


425776 


1 lift io 

U25i2B 


H&l^ro 


parathyroid hormone receptor 2


431515 


NNU)12l52 


nS.25ooB3 


endothelial differentiation, lysophospha


419452 


U33835 


Hs.9(^ 


PTK7 protein tyrosine kinase 7


432653 


K6203S 


HS.2931K 


ESTs, Weakly similar to JC7328 amino aci


432653 


No2D95 




ESTs, Weakly similar to JC7328 amino aci


432653 


N62096 




ESTs, Weakly similar to JC7328 amino aci




IMdcUtD 


lie OQOIAC 


ESTs, Weakly similar to JC7328 amino aci


410001 


AB041036 


H8.57771 


kallikrein 11


426501 


AW043782


Hs.293616 


ESTs 


408369 


R38438 


Hs.182575 


solute carrier family 15 (H??? transport


445413 


AA151342 


Hs.12677 


00-147 pmtebi 


422424 


AI186431


Hs.296638


prostate differentiation factor


428330 


L22524


Hs.2258 


matrix metalloproteinase 7 (matrilysin,


420610 


AI683183


Hs.99348 


distal-less homeo box 5



Seq ID No: 670 & 671
Seq ID No: 672 & 673
Seq ID No: 674 & 675
Seq ID No: 676 & 677
Seq ID No: 678 & 679
Seq ID No: 680 & 681
Seq ID No: 682 & 683
Seq ID No: 684 & 685
Seq ID No:
Seq ID No:



TABLE 15B

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey 
309931 
330493 
439285 



CAT Number Accession
AW341683 
33264j5 
47085.1 



M27826 R76416AA307645AW957879 AW957800 AA633529 K03562 

AL133916 N79113 AF086101 N76721 AW950828 AA364013 AW9S5684A1346341 AI867454 N54784A1655270A)421 279 AWD1 4882 
AA775552 N62351 N59253AA626243Al341407eE175639AA458968AI3S8918AA457077 
450375 83327J AA009647AA131254AA374293AW954405 H04410AW606284AA151166BE157467BE157601 H04384 W46291 AV^ H01532 

AA190993 K03231 H59605 H01642 AA852876 AA1137SB AA628915 AA746952 AI161014 AA099S54 R59067 
451320 86576J AW 18072 AI631982T1 5734 AA2241 95 AI701458W20198F26326AA890570 N90552AW071907A^^^ 
A!124086AA224388AI084316AI354686T33652A1140719AI720211T03490AI372837T15415AV\K^ 
AA017131 AA443303 T33823 AI2225S6 T33511 T337B5 At419606 055612 



TABLE 15C 



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey 

402075 

403329 

403478 

404440 

404877 

405770 

405932



Ref 

8117407 
8516120 
9956258 
7528051 
1519284 
2735037 
7767812 



Strand 

Plus

Plus

Plus

Plus

Plus

Plus 



Nt_position

121907-122035,122804-122921,124019-124161,124455-124610,125672-126076

96450-96598 

116458-116564 

80430^1581 

1095-2107 

61057-62075 

123525-123713 
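
The Nt_position entries of Table 15C can be read as comma-separated start-end pairs, one pair per predicted exon. The following is a minimal Python sketch of that reading. It assumes 1-based, inclusive coordinates and a hyphen between start and end; the function names parse_exon_ranges and total_exon_length are illustrative only and do not appear in the original disclosure.

```python
from typing import List, Tuple


def parse_exon_ranges(nt_position: str) -> List[Tuple[int, int]]:
    """Split an Nt_position string such as '96450-96598' into (start, end) pairs."""
    exons = []
    for part in nt_position.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"exon start {start} is after exon end {end}")
        exons.append((start, end))
    return exons


def total_exon_length(exons: List[Tuple[int, int]]) -> int:
    """Sum exon lengths, treating both endpoints as included (assumed convention)."""
    return sum(end - start + 1 for start, end in exons)


if __name__ == "__main__":
    exons = parse_exon_ranges("96450-96598")
    print(exons, total_exon_length(exons))
```

Under these assumptions the single-exon entry 96450-96598 spans 149 nucleotides.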
Table 16

Seq ID NO: 1 DNA sequence

Nucleic Acid Accession #: NM_001216

Coding sequence: 43..1422

1          11         21         31         41         51

|          |          |          |          |          |

GCCOSTACAC ACC3GTGTGCT GGGACACCCC ACAGTCAGCC GCATGGCTCC CCTGTGCCCC 60 

AGCCCCTGGC TOCCTCTGTT GATOCOGGCC CCTGCTCCAG GCCTCACTGT GCAACTGCTG 120 

CTGTCACT6C T6CTTCTGAT GCCIGTCCAT CGCCA6AQGT TGCCC066AT 6CAG6A0GAT 180 

TGOCCCTTGG 6AGGAG6CTC TTCTGGGGAA GATGACOQVC T6GG0GAGGA GGATCTGCCC 240 

AGT6AAGAG6 ATTCACCCA6 AGAGGAGGAT OCACCCGGAG AGGAGGATCT ACCTGGAGAG 300 

6AGGATCTAC CTGGAGAGGA GGATCTACCT GAAGTTAAGC CTAAATCAGA AGAAGAGGGC 360 

TCCCTGAAGT TAGAGGATCT ACCXACTGTT GAGGCTCCTG GAGATCCTCA AGAACCCCAG 420 

AATAAT6CCC ACAGGGACAA AGAAGGGGAT GACCA6AGTC ATTG6C3GCTA TGGAG6GGAC 480 

COGCCXrPGGC CC0GGGTX3TC CCCAGCCTGC GOGGGCOSCT TCCAGTCCOC GGTQ6ATATC 540 

OGCCCCCAGC TCGCCGCCTT CTGCCOGGOC CTGOGCOCCC TGGAACTCCT GGGCTTCCAG 600 

CTCCCGCCGC TCCCAGAACT QOGCCTGCGC AACAATGGCC ACAGTGTGCA ACTGACXXTPG 660 

CCTCCTGGGC TAGAGATGGC TCTGGGTCCC GGGOGGGAGT ACCGGGCTCT GCAGCPGCAT 720 

CTGCACTGGG OGGCTGCAGG TCX3TCCGGGC TOGGAGCACA CTX3TGGAAGG CCAOOGTTTC 780 

CCTGCOGAGA TCCACGTGGT TCACCTCAGC ACOGCCTTTG CCAGAGTTGA CGAGGCXnTC 840 

GGGOSCCCGG GAGGCCTGGC CGTGTTGGCC GCCTTTCTGG AGGAGGGCCC GGAAGAAAAC 900 

AGTGCCTATG AGCAGTTGCT GTCTOGCTTG GAAGAAATCG CTGAGGAAGG CTCAGAOACT 960 

CAGGTCCCAG GACTGGACAT ATCTGCSVCTC CTGCCCTCTG ACTTCAGCOS CTACTTCCAA 1020 

TATGAGGGGT CTCTGACTAC ACOGOCCTGT 6CCCAGGGTG TCATCTCGAC TGTXSTTTAAC 1080 

CAGACAGTGA TOCTGAGTGC TAAGCAGCTC CACACCCTCT CTGACACCCT GTGGGGAOCT 1140 

GGTQACTCTC G6CTACAGCT GAACTTC0S2V GOGACGCAGC CTTTGAATGG GCQAGTGATT 1200 

GAG6CCTCCT TCCCTSCT G G AGTGGACAGC AGTCCTOSGG CTGCTGaGCC AGTCCAGCTQ 1260 

AATTCCTGCC TGQCTGCTGG TGACATCCTA GCCCTGGTTT TTGGCCTCCT TTrrGCTGTC 1320 

ACCAGCGTCG OGTTCCTTGT GCAGATGAGA AGGCAGCACA GAAGGGGAAC CAAAGGGGGT 1380 

GTGAGCTACC GCCCAGCA6A QGTAGCCGAQ ACTGGAGCCT AGAGGC7GGA TCTTGGAGAA 1440 

TGTQAGAAGC CAQGCAGAGG CATCIGAGOG G6A6C0QGTA ACTGTCCTGT OCTGCTCATT 1500 
ATOCCACTTC CTTTTAACTG CCAAGAAATT TTTTAAAATA AATATITATA AT 
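
The "Coding sequence" line of each DNA entry gives the first and last nucleotide of the open reading frame. The following minimal Python sketch shows how such a pair could be applied to a sequence string. It assumes the coordinates are 1-based and inclusive, as is conventional for GenBank features; the helper names coding_region and codon_count are illustrative and are not part of the original listing.

```python
def coding_region(sequence: str, start: int, end: int) -> str:
    """Return the bases from start to end, using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return sequence[start - 1:end]


def codon_count(start: int, end: int) -> int:
    """Number of codons (stop codon included) spanned by the coding coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return length // 3


if __name__ == "__main__":
    print(codon_count(43, 1422))
    print(coding_region("AACCGGTT", 3, 6))
```

For the coordinates 43..1422 this gives 460 codons, that is, 459 amino acid residues followed by a stop codon.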

Seq ID NO: 2 Protein sequence:
Protein Accession #: NP_001207

1          11         21         31         41         51

|          |          |          |          |          |

MAPLCPSPWl, PUiIPAPAPG tiTVQLLLSLL LLMPVHPQRL PRKQEDSPLG GGSSGEDDPL 60 

GEEDLP8BED SPHEEDPPGS EDLP6EEDLP 6EEDLPEVKP KSEEEGSLKL B)LPTVEAPG 120 

OFQBPQIQZAH BDKE6DDQSH HRlfG(3>PFHP 8VSPACAGRF QSPVDIRPQL AAFCPALRPIi 180 

SLXiGFOItPPIi PEXiRLSlOnSI SVQLTLPPGL EMAXjGPGREY RALQLELBUG AAGRPGSEBT 240 

VEGHRFPABI EWBLSTAFA &VDEALGRPG GLAVLAAFZiE EGPEENSAYB QLLSRLEBIA 300 

EEGSETQVPG LDISALLPSD PSRYFQYBGS LTTPPCAQGV IHTVFNQTVM LSAKQLHTLS 360 

DTLHGP(3)SR LQUfFRATQP LNGRVIEASF PAGVDSSPRA AEPVQLNSCL AAGDILALVF 420 
GLLFAVTSVA PLVQMBRQHR KGTKG8VSy& PAEVABTGA 

Seq ID NO: 3 DNA sequence

Nucleic Acid Accession #: BC013923

Coding sequence: 438-1391

1          11         21         31         41         51

|          |          |          |          |          |

AGOGCGGTTO TCTATTAACT TGTTCAAAAA GTATCAGGAG TTGTCAAGGC ACAGAAGAGA 60 

GT6TTTGCAA AAGGGGGAAA GTAGTTTGCT 6CCTCTTTAA GACTAGGACT GAGAGAAAGft 120 

AGA6GAGAGA GAAftGAAAGG GAGAGAAGTT TGA6C00CA6 GCTTAA6CCT TICCAAAAAA 180 

TAATAATAAC AATCATOGGC GGC6GCA0GA TCGGCCA6AG GA6GAS66AA GCGCTTTTTT 240 

TGATCCTGAT TCCAGTTTGC Cl'Cl' C i't.TiT TTTTCCCCCA AATTATTCTT OGOCTGATTT 300 

TCCTOGCGGA GCCCTGCGCT CCCGACACCC CCGCCOGCCT CCCCTCCTCC TCTCCCCCCG 360 

CCC6QG6GCC CCCCAAAGTC CCGGCCGGGC CGAG6GTCGG OGGCQGCCGG 0QGGC0GG6C 420 

CC006CACAG CXSCCCGCATS TACAACATGA TSGAGAOSGA GCTGAAGCOS COSOaOOOGC 480 

AGCAAACTTC G66GGG0G6C 06CG6CAACT CCACC8Q66C GGCGGC0G6C G6CAACCAGA 540 

AAAACAQCCC GGACCX5CGTC AAGCGGCCCA TGAATGCCTT CATGGTGTGG TCCOGOGGGC 600 

AGOGGOGCAA GATGGCCCAG GAGAACCCCA AGATGCACAA CTCGGAGATC AGCAAGCGCC 660 

TGGGC6C0GA GTGGAAACTT TTGTC6GAGA CGGA6AAGGG GCCOTTCATC GACQAGGCTA 720 

AGOGGCTOCS AGC6CIGCAC ATSAAGGAOC AOCOSGATTA TAAAIAOOSG O0C0G6C6GA 780 

AAACCJVA6AC 6CTCATGAA6 AA6GATAAGT ACAOSCTGCC C66CGGGCTG CTGGCCCCCG 840 

GCGGCAATAQ CATGGOQAOC GGGGTCGG60 TGGGOGCOGG CCTGGGCG06 GGOGTGAACC 900 

AG06CATGGA CAGTTAOS06 CACATGAACG GCTGGAGCAA CGGCAGCTAC AGCATGATGC 960 

AGGACCAGCT GGGCTACCC6 CAGCACCCGG GCCTCAJVTGC GCA0GGC6CA GOQCAGATGC 1020 

AGCCa^IGCA GCQCTAOGAC GTGAG06CCC TGCAOTACAA CTCCATGACC AGCTOQCAGA 1080 

CCTAOVTGAA OGGCT0600C ACCTACAGCA TGTCCTACTC GCAGCAG66C A0CCCT6GCA 1140 

TGGCTCTTGG CTCCATGGGT TC6GTQGTCA A6TCC6AGGC CAGCTOCAGC CCCCCTGTGG 1200 

TTAOCTCTTC CTCCCACTCC AOOOOOCCCT GCCAGGCCGG GGACCTCCGG GACATGATCA 1260 

GCATGTATCT CCCCQGOGCC GAGQT600GG AACCCGCCGC CCCCAGCAGA CTTCACATGT 1320 

CCCAGCACTA CCA6A60GGC CC6GTOCC0G GCACQGCCAT TAACQGCACA CTGCCCCTCT 1380 

CACACATGTG AGGGCOGGAC AGG6AACTG6 AGGGGGGAGA AATTTTCAAA GAAAAAOGAG 1440 

GGAAATGGGA GGGGTGCAAA AGAGGAQAGT AAGAAACAGC ATGGAGAAAA CCCGGTACGC 1500 

TCAAAAAAAA AAAAAAAAAA AAAATCCCAT CACCCACA6C AAATGACAGC TGCAAAAGAG 1560 

AACAOCAATC CCATCCACAC TCACGCAAAA ACOGCGATGC C6ACAA6AAA ACTTTTAT6A 1620 

GAGA6ATCCT GGACTTCTTT TKGGG6GACT ATTTTTGTAC AGAGAAAAOC TQGGGAGOQT 1680 

GG6GAGGG0G GGGGAATGGA CCTTGTATAG ATCTGGAGGA AAGAAAGCTA CQAAAAACTT 1740 

TTTAAAAGTT CTAGTGGTAC GGTAGGAGCT TTOCAGGAAG TTTGCAAAAG TCTTTACCAA 1800 

TAATATITAG AGCTAGTCTC CAAGOSAOGA AAAAAATGTT TTAATATTTG CAAGCAACTT 1860 

TTGTACAGTA TTIATCQAGA TAAACATGGC AATCAAAATG TCCATTGTTT ATAAGCTGAG 1920 
AATTT6CCAA TATTTTTCAA GGAGAGGCTT CTTCCTSMT TTTGATTCTS GAGCTGAMVT 1980 

TTAGGACAGT TGCAftAOGTG AAAAGAAGAA AATTATTCAA ATTTQGACAT TTTAATTGTT 2040 

TAAAAATTGT ACAAAAGGAA AAAATTAGAA TAAGTACTG6 C3GAACCATCT Cr G TGG T CTT 2100 

<?rrTAAAAAG GGCAAAAGTT TTAGACTGTA CTAAATTTTA TAACTTACTG TTAAAAGCAA 2160 

AAAT6GCCAT GCAGGTTGAC ACCXTrTGGTA ATTTATAATA GCTTTTGTTC 6ATCCCAACT 2220 

nCCATTTTG TTCASATAAA AAAAftCXATG AAATTACT6T GTTTQAAATA TTTTCTTATG 2280 

GTTTGTAATA TTTCTGTAAA TTTATTGTGA TATTTTAA60 TTTTCCCCCC TTTATTTTCC 2340 

GTAGTTGTAT TTTAAAAGAT TCGGCTCTGT ATTAnTQAA TCAGTCTCCC QWSAATCCAT 2400 

GTATATATTT GAACTAATAT CATCCTTATA ACAGGTACAT TTTCAACTTA AGrrTTTACT 2460 

CCATTATGCA CASTTTGAGA TAAATAAATT TTTCAAATAT GGACACT6AA AAAAAAAAAA 2520 

AAAAAAACAA AACAAAAAAA CAAAAAACAA AAACAGAAAA AACAAAAAAA AAAACAAAAC 2580 

CACftACACAA AAACAAAAAA AAAAAAAAGA AAChAACACA CAACACAACA GAACACAAAA 2640 

cx:acaacaca aacaacaaca cacagaggg 

Seq ID NO: 4 Protein sequence:
Protein Accession #: CAA83435.1

1          11         21         31         41         51

|          |          |          |          |          |

KYNmSTELK PPGFQQTSGG GGGNSTAAAA GGKQKNSPDR VKRPKKAFMV WSRGQRRKMA 60 

QEMPKMHNSE ISKRLGABWK UiSETEXRPF IDEAKRLRAL HHKEHPDYKY RPRRKTKTLM 120 

KKDKTTIiPGG LLAPGGKSMA SGVGVGAGLG AGVNQRMDSY AHMHGWSNGS YSMMQDQLGY 180 

PQHF6UIAEG AAQMQFMHRY DV8AI4YNSM TSSQTY?0I6S PTYSMSYSQQ GTFGHAL6SM 240 

GSWKSEASS SPFWTSSSR SRAPGQACa)!* RSHISKYLPG AEVPEPAAPS RLHMSQBYQS 300 
GPVPGTAING TLPLSHM 

Seq ID NO: 5 DNA sequence
Nucleic Acid Accession #: U91618
Coding sequence: 29-541

1          11         21         31         41         51

|          |          |          |          |          |

GGGACTTOaC TTQTTAOAAO 6CTGAAAGAT GAIGGCAG6A ATGAAAATCC A6CTTGTATG 60 

CATGCrACTC CTG6CTTTCA GCTCCTGGeAG TCTGTGCtCA GATTCAGAAQ AG6AAATGAA 120 

AGCATTAGAA GCAGATTTCT TGACCAATAT GCATACATCA AAGATTAGTA AAGCACATGT 180 

TCCCTCTTGG AAGATGACTC TGCTAAATGT TTGCAOTCTT GTAAATAATT TGAACAGCCC 240 

AGCT6A6GAA ACAG6AGAA6 TTCAT6AAGA G6A6CTTGTT GCAAGAAGGA AACTTCCTAC 300 

TGCTTTAGAT GQCTTTAGCT TGGAAGCAAT GTTGACAATA TACCAGCTCC ACAAAATCT6 360 

TCACAOCAOG 6CTTTTCAAC ACTG6GAGTT AATCCAGGAA GATATTCTTG ATACTGGAAA 420 

TGACAAAAAT GGAAAGGAAG AAGTCATAAA GASAAAAATT GCTTATATTC TGAAACGGCA 480 

GCTGTATGAG AATAAACCCA GAAGACCCTA CATACTCAAA AGAGATTCTT ACTATTACTG 540 

AGAGAATAAA TCATTTATTr ACATGTGATT GTGATTCATC ATCCCTTAAT TAAATATCAA 600 

ATTATATTTG TQTGAAAATO TGACAAACAC ACTTATCXGT CTCTTCTACA ATTGTGGTTT 660 

ATT6AATGT6 TTTTTCTGCA CTAATAGAAA TTAGACTAAG TGTTTTCAAA TAAATCXAAA 720 
TCTTCAAAAA AAAAAAAAAA AAATGGG6CC GCAATT 



Seq ID NO: 6 Protein sequence:
Protein Accession #: AAB50564

1          11         21         31         41         51

|          |          |          |          |          |

KMAGMKIQLV CKUiLAFSSW SLCSDSEEEM KALSADFLTN MBTSKISKAH VPSWKMTLLN 60 

VCSLVNNI^S PAEBT6EVHB EBbVARRKLP TALDGFSIiEA MLTIYQLHKI CHSRAFQHWB 120 
LIQEDILDTG NDKKGKEEVI KRKIPYILKR QLYEMKPRRF YILRRDSYYV 

Seq ID NO: 7 DNA sequence

Nucleic Acid Accession #: NM_006536.2

Coding sequence: 109-2940

1          11         21         31         41         51

|          |          |          |          |          |

ACCTAAAACC TTGCAAGTTC AGGAAGAAAC CATCT6CATC CATATT6AAA ACCTGACACA 60 

ATGTATGCAO CAGGCTCAGT GTQAGTGAAC TGGAGGCTTC TCTACAACAT GACCCAAAGG 120 

AGCATTGCAG GTCCTATTTG CAACCTGAAG TTTGTGACTC TCCTGGTT6C CTTAAGTTCA 180 

6AACTCCCAT TCCTGGGAGC TGGAGTACAG CTTCAAGACA ATGGGTATAA TGGATTGCTC 240 

A7T6CAATTA ATCCTCAGGT ACCTGAGAAT . CA6AACCTCA TCTCAAACA7 TAA6GAAATG 300 

ATAACTGAAG CTTCATTTTA CCTATTTAAT GCTAOCAAGA GAAGAGTATT TTTCAGAAAT 360 

ATAAASATTT TAATACCTGC CACATGQAAA GCTAATAATA ACAGCAAAAT AAAAGAAGAA 420 

TCATATGAAA AGGCAAATGT CATAGTGACT GACTGGTATG OGGCACATGG A6ATGATCCA 480 

TACACCCTAC AATACAGAOG GTQTG6AAAA GAGGGAAAAT ACATTCATTT CACACCTAAT 540 

TTCCTACT6A ATSATAACTT AACAGCT6GC TAOQGATCAC GAGGCOQAGT GTTTQTCCAT 600 

GAATGGGCCC ACCTCCGTTG G6GT6TGTTC GATGAOrrATA ACAAT6ACAA ACCTTTCTAC 660 

ATAAATGGGC AAAATCAAAT TAAAGTGACA AGOTGTTCAT CT6ACATCAC AGGCATTTTT 720 

GTGTGTGAAA AAGGTCCTTO CCCCCAAGAA AACTGTATTA TTAGTAAGCT TTTTAAAGAA 780 

GGATGCACCT TTATCTACAA TAGCAOCCAA AATGCAACTO CATCAATAAT GTTCATGCAA 840 

AGTTTATCTT CTGTGGTTGA ATTTTGTAAT 6CAAGTACCC ACAACCAAGA AOCACCAAAC 900 

CTACSVGAAOC AGAT6TGCAG CCTCA6AAGT GCATGQ6ATG TAATCAGAOA CTCTGCTGAC 960 

TTTCACCACA GCTTTCCCAT GAATGOQACT GAOCTTCCAC CTCCTCCCAC ATTCTCGCTT 1020 

GTACAOGCTG GTGACAAAGT GGTCTGTTTA GTGCTGGATG TGTCCAGCAA GATGGCAGAG 1080 

GCTGACAGAC TCCTTCAACT ACAACAAGCC GCAGAATTTT ATTTGATGCA GATTGITGAA 1140 

ATTCATACCr TOQTOGGCAT T60CAGTTTC GAOGCAAAG GAQA6ATCAG AGCCOUSCTA 1200 

CAOCAAATTA ACAGCAATGA TGATCGAAAG TTQCT6GTTT CAXATCTGCC CACCACTGTA 1260 

TCAGCTAAAA CAGACATCAG CATTTGTTCA GGGCTTAAGA AAGGATTTGA GGTGQTTGAA 1320 

AAACTGAATC GAAAAGCTTA TGGCTCTGTG ATQ^TATTAG TGACCAGOGG AGATGATAAG 1380 

CTTCTTGGCA ATTGCTTACC CACTCTGCTC AOCAGTGGTT CAACAATTCA CTCCATTGCC 1440 
CTGGGTTO^T CTGCAGCOOC AAftTCIGGAG GAATTATCAC 

rxvmvn ' C cywsATATAtc aaactccaat agca.tgattg 

TCTGGAACTG GAGACATTTT CCAGCAACAT ATTCAGCTTG 
AAAOCTCACC ATCAATTGAA AAACAOVGTG ACTGT6GATA 
ATGTTTCTA6 TTACX7T66CA G60CA6TGGT CCTOCTGAGA 
G6AOSVAAAT ACTAC9U»AA TAATTTTATC ACCAATCTAA 
TGGATrCX3U3 (SAACA6CTAA GCCT6GGCAC TCGACTTACA 
TCTCTGCAAG CCCTGAAAGT GACAGTGACC TCTOGOSCCT 
6CCACTGTGG AAGCCTTTCT GGAAAGAGAC AGCCTCCATT 
TAT60CAATG TGAAACAGG6 ATTTTATCCC ATTCTTAATG 
(3U30CAGA6A CTG6A6AT0C 7GTTACX3CTO AfiACTOCTTG 
GTTATAAAAA ATGATGGAAT TTACTCGAGG TATTTTTTCT 
TATAGCTTGA AAGTGCATGT CAATCACTCT CCCAGCATAA 
CCA6G6AGTC ATGCTATGTA TGTACXAGGT TACACA8CAA 
GCTCCAA6GA AATCAGTAGG CAGAAATGAG GAGGAGCGAA 
AGCTCAGGAG GCTCCTTTTC AGT6CTGGGA GTTCCAGCTG 
CCACCAT6CA AAATTATTGA CCTGGAAGCT GTAAAAGTAG 
TGGACAGCAC CTGGAGAAGA CTTTGATCAG GGCCAGGCTA 
AGTAAAAGTC TACAGAATAT CCAAGATGAC TTTAACAATG 
AAGCGAAATC CTCAGCAAGC TGGCATCAGG GAGATATTTA 
ACGAATGGAC CTGAACATCA GCCAAATQGA GAAACACATG 
6CAATA0GAG CAAT6GATAG GAACTCCTTA CAGTCTGCTG 
OCTCTGTTTA TTCCCCCCAA TTCTGATCCT GTACCTGCCA 
GQAtJTTTTAA CAGCAATGGG TTTGATAGGA ATCATTTGCC 
CATACTTTAA GCAGGAAAAA GAGAGCAGAC AAGAAAGAGA 
ATAAATATCC AAAGTGTCTT CCTTCTTAGA TATAAGACCC 
CATACTAACA AAGTCAAATT AACATCAAAA CTGTATTAAA 
ATACAGATAA GATTTTTACA TGGTAGATCA ACAATTCTTT 
OCTTACACTT TQGCTATGAA CAAATAATAA AAATTATTCT 
6CAAAGGGAA GGGTAAAGTC GGACCAGTCT CAAGGAAA6T 
AATAGCCCCA AGCAGAGAAA AGGAGGGTAG GTCTGCATTA 
TCATTTAGTT ACTTTGATTA ATTTTTCrrT TCTCCTTATC 
TTTACATQAA GATCATGCTA TATTTTATAT AITGTAOCCCC 
CTTGCTATTT TGTTATATAT ATTTCAGAT6 ACATCTCCCT 
TTTCACTGTA A6AG6TAACC TTTAACAATA TGGGTATTAC 
TTTATGACAA AGGTCTATTG AATTTATTTG TNTGTAAGTT 
TTTCTAAGTT TATTGCCTTG GGTTATTATG GAATGATAGT 
TACCTAGGAA A 

Seq ID NO: 8 Protein sequence:
Protein Accession #: NP_006527.1






GTCTTACAO6 

AIGCTTTCAG 
AAAGTACAGG 
ATACTGTGGO 
TTATATTATT 
CTTTTGGGAC 
CCCIUAACAA 
CCAACTCAGC 
TTCCTCATCC 
CCACTGTCAC 
ATGATG SAGC 
CCTTTGCTGC 
GCACCCCAAC 
AOQGTAATAT 
AGTGGGGCTT 
GCOOCCACCC 
AAGAG6AATT 
CAAGCTATGA 
CTATTTTAGT 
CGTTCTCACC 
AAAGCXACAO 
TATCTAACAT 
GAGATTATCT 
TTATTATAGT 
ATGGAACAAA 
AT6GCCTT0G 
ATGCATTGAG 
TTGGGGGTAG 
TTAAAGTAAT 
rTGTTTTATT 
TAACTGTCTG 
TGTGCAGTAC 
TAATGCAAAG 
GCTAATGCTC 

TCTACTCCCA 
TATAGCCCCN 



AGCHTTAAAG 
TAGAATTTCC 
TGAAAATGTC 
CAACGACACT 
TGATCCTGAT 
AGCTAGTCXT 
TACCCATCAT 
TGT6CCCCCA 
TGTGATGATT 
TGCCACAGTT 
AGGTGCTGAT 
AAATGGTAGA 
CCACTCTATT 
TCAGATGAAT 
TAGCOGAGTC 
TGATGTGTTT 
GACCCTATCT 
AATAAGAATG 
AAATACATCA 
CCA6ATTTCC 
AATTTAT3TT 
TGCCXAOGG6 
TATATTGAAA 
TGTGACACAT 
ATTATTATAA 
ACTACAAAAA 
TTTTTGTACA 
ATTAGAAAAC 
GTCTTTAAAG 
GAGGTGGAAA 
TGTGAAGCAA 
AGGTTGCTTG 
CTCTTTACCT 
AGA6ATCTTT 
TCATACXMGT 
TCAAAGCAGC 
TATAATGCXrr 



1500
1560
1620
1680
1740
1800
1860
1920
1980
2040
2100
2160
2220
2280
2340
2400
2460
2520
2580
2640
2700
2760
2820
2880
2940
3000
3060
3120
3180
3240
3300
3360
3420
3480
3540
3600
3660



1          11         21         31         41         51

|          |          |          |          |          |

MTQRSZAGPI OILKFVTLLV ALSSELFFLG AGVQLQDNGY NGliLIAINPQ VPENQKLISN 60
XKEMITEASF YLFliATKRRV FPRNIKILIP ATHKAKMNSK XKQESYEKAN VIVTDWYGAH 120
ODPYTLOYR GOGKBGKYIH FTPNFLLNDN LTAGYGSRGR VFVREWABLR HGVFDEYMND 180
KPFYINGOKQ lEVTRCSSDI TGIFVCEKGP CPQENCXZSR LFXBOCTFIY NSTQKATA8Z 240
MFMQSLSSW EFOaASTRNQ BAPNLQMQMC 8LRSAHDVZT DSADFHHSFP MNGTBLPPPP 300
TFSLVQAGDK WCLVLDVSS KMAEADRLLQ LQQAAEFYU4 QIVEIHTFVG IA5FDSKGEI 360
RAQLHQIKSH DDRKLIiVSYL PTTVSAKTDI SICSGLKKGF EWEKLNGKA YGSVMILVTS 420
GDDKLLGKCL ptvlss6stz HSIAJiGSSAA FNIiEEXtSRLT GGliKFFVPDI SNSKSMIDAF 480
SRISSGTGDI FQQHIQIjEST GENVKPBBQL KNTVTVDinV GtiUtnkLVTH OASGPPBZZXi 540
PDPDGRinrVT MNFITNLTFR TASLNIPGTA KPCanfTYTLN NTBBSIiQAIiK VTVTSRASRS 600
AVPPATVEAP VERDSLHFFH PVMIYANVKQ GPYPILNATV TATVBPETGD PVTLRIiliDDG 660
AGADVIKNDG lYSRYFPSFA AMGRYSLKVH VKHSPSISTP AE5IPGSHAM YVPGYTANCai 720
IQHKAPR2CSV GRNEEERKWG FSRVSSGGSF SVLGVPAGPH PDVFPPCKII DLBAVKVEEE 780
LTLSNTAPGE DFDQ6QAT&Y BIRMSKSLQIf IQQDFmDaij VNTSKRMPQQ AOIRBZFTFS 840
PQI8TNQPEH QPNGBXHESH RIYVMRAMD RNSLQSAVSN ZAQAPLFZPP HSDPVPAKDY 900
LZLXGVLTAM GLIGIZCLIZ WTHHTLSRK KRADKKENGT KLL



Seq ID NO: 9 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 336-632

1          11         21         31         41         51

|          |          |          |          |          |

CTCOOCTCAC CCOGGTCCAG GATGCOCAGT COOCaOGACa OCTOCCACTT CCCACTOTGO 60 

CCTGGGTGGG CTCAGGGGCT GCCCTTGACC TG0CCTA6AG CXXTOTCCCA GCTGGTGGTO 120 

GAGCTGGCAC TCTCTGGGAG GGAGGGGGCT GGGAGGGAAT GAGTGGGAAT GGCAAGAG6C 180 

CAGGGTTTG6 TGGGATCAG6 TTGAGGCAG6 TTT6GTTTCC TTAAAATGCC AAGTTGGGGG 240 

0CAGT08Q0C CCACATATAA ATOCTCACCC T8GGA6CCTG GCTGOCTTGC TCTCCTTOCT 300 

GGGTCTGTCT CTCCCACCTO GTCTGCCACA GATCCATGAT QTGCAGTTCr CT6QAGCAG0 360 

CGCTGGCTGT GCTGGTCACT ACCTTCCAC3V AGTACTCCTG CCAAGAGGGC GACAAOTTCA 420 

AGCTGA6TAA GGGGGAAATG AAGGAACTTC TGCACAAGGA 6CTGCCCAGC TTTG7GGGGG 480 

AGAAA6TGGA TGAGGAGGGG CTGAAGAAGC TGATGGGCAG CXrZGGATGAG AACA6TGACC 540 

AGCAGGT6GA CTTCCAGQAG TATOCTGTTT TCCTG6CACT C!ATCACTGTC AT6TGCAAT6 600 

ACTTCrrCC3^ GGGCTGCCCA 6AC0GACCCT GAAGCAGAAC TCTTGACTTC CT6CCATGGA 660 

TCTCTTGGGC CCAGGACTGT TQATGCCTTT GAGTTTTGTA TTCAATAAAC TTTTTTTGTC 720 

TGTTGATAAT ATTTTAATTG CTCAGTGATG TTCCATAACC CGGCTGGCTC AGCTGGAGTO 780 

CTGG6ASATG AG6GCCTCCT GGATCCTGCT COCTTCTGGG CTCTGACTCT CCTGGAAATC 840 

TCTCCAAGGC CAGAGCtATG CTTTAGGTCT CAATTTT6GA ATTTC3^AACA CCAGCAAAAA 900 

ATTGGAAATC GAGATAGGTT GCTGACTTTT ATTTTGTCAA ATAAA8ATAT TAAAAAAGQC 960 

AAATACCA 

Seq ID NO: 10 Protein sequence:
Protein Accession #: NP_005969.1

1          11         21         31         41         51

|          |          |          |          |          |

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 60
GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP

Seq ID NO: 11 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 336-626

1          11         21         31         41         51

|          |          |          |          |          |

CTCCCCTCAC CCOGGTCCAG GAT6CCCA6T CCCCAOGACA CCTCCCACTT CCCACTGTGG 60 

0CTG6GT6GG CTCAGGGGCT GCGCTTGACC TGGOCTAGAS COCTOCCC CA GCTGGTGGTO 120 

GAGCTGQCAC TCTCTG6GAG GGAGGGGGCT 6GGA6G6AAT GAGT6GGAAT G6CAAQAGGC 180 

CAGGGTTTGG TGGGATCAGG TTGAGGCAGG TTTGGTTTCC TTAAAATGCC AAGTTGOGGG 240 

(XACnXSGGGC CCACATATAA ATCCTCACCC TGGGAGCCTO GCTGCCTTGC TCTCCTTCCT 300 

GGGTCTCrrCT CT6CCACCT6 GTCTGCCACA GATCXIATGAT GTGCAGTTCT CTOGAGCAGG 360 

CGCTGGCTGT GCTGGTCACT ACCTTCCACA AGTACTCCTG OCAAGAGGGC 6ACAAGTTCA 420 

AGCTOAGTAA GGOGGAAATG AAGGAACTTC TX3CACAAGGA GCTGCCCAGC TTTGTG06GC 480 

ATTCCAGAGA AOCATGTGCT GTGAGGGCCT TCCX»GTCCA TCTOTTTAAT CCTGTCATTG 540 

GAGACTTGAG AAACXAGAGC CCAGAAGGGA AAAGTGATTG TCCCAAGATC ACACAGCACT 600 

GGAGAAAGTG GATGAGGAGG GGCTGAAGAA 6CTGATGGGC AGCCTGGATG AGAACAGTGA 660 

CCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTCCTGGCA CTCATCACTG TCATCTGCAA 720 

TGACTTCTTC CAGGGCTGCC CAGACC6ACC CTGAAGCAGA ACTCTTGACT TCCTGCCATG 780 

GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG TATTCAATAA ACTTTTTTTG 840 

TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA CCOGGCTGGC TCAGCrGGAG 900 

TGCTGGGAGA TGAGGGCCTC CTGGATCCTG CTCCCTTCrG GOCTCPGACT CTCCTGGAAA 960 

TCTCTCCAAG GCCAGAGCTA TGCTTTAGBT CTCAATTTTG GAATTTCAAA CACCAGCAAA 1020 

AAATTGGAAA TCGAGATAGG TT6CTGACTT TTATTTTGTC AAATAAAGAT ATTAAAAAAG 1080 
GCAAATACCA 

Seq ID NO: 12 Protein sequence:

Protein Accession #: Eos sequence

1          11         21         31         41         51

|          |          |          |          |          |

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGHS REPCAVRAFR 60

VHLFNPVIGD LRNQSPEGKS DCPKITQHWR KWMRRG



Seq ID NO: 13 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 58-354

1          11         21         31         41         51

|          |          |          |          |          |

GTGAGCTCAC CATGTGGGGG TGAGGCTGAG AGAAAACAAG TACACAGCCA CAGATCCATG 60 

ATGTGCAGTT CTCTGGAGCA GGCGCTGGCT GTGCTGGTCA CTACCTTCCA CAAGTACTCC 120 

TGCCAAGAGG GOGACAAGTT CAAGCTGAGT AAGGGGGAAA TGAAGGAACT TCTGCACAAG 180 

GAGCTGCCCA 0CTTT6TGG0 GGAGAAAGTG GATOAGGAGO GGCTGA AGAA GCTGATG GGC 240 

AGCCTGGATG AGAACAGTGA CCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTOCTGGCA 300 

CTCATCACTG TCATGTGCAA TGACTTCTTC CAGGGCTGCC CAGACCGACC CTGAAGCAGA 360 

ACTCTTGACT TCCTGCCATG GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG 420 

TATTCAATAA ACTTTTTTTG TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA 480 

C0GQGCT6GC TCAGCTGGAG TOCTaGGAGA TGAGGGCCTC CTGGATCCTG CTCCCTTCTG 540 

G6CTCTGACT CT0CT6GAAA TCTCTCCAAG GCCAGAGCTA TGCTTTAOGT CTCAATTTTG 600 

GAATTTCAAA CACCAGCAAA AAATTGGAAA TCGAGATAGG TTGCTGACTT TTATTTTGTC 660 
AAATAAAGAT ATTAAAAAAG GCAAATAOCA 






Seq ID NO: 14 Protein sequence:
Protein Accession #: NP_005969.1

1          11         21         31         41         51

|          |          |          |          |          |

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 60
GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP

Seq ID NO: 15 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 62-358

1          11         21         31         41         51

|          |          |          |          |          |

GGAGGGTGT6 C06CTGAGTC ACTGCCT GG G CATCTGGGCC TQGAACCTGG GCCACAGATC 60 

CATGATGT6C A6TTCTCTGG AGCAGOGOCT GG C TGTGCTG GTCACTACCT TCCACAA6TA 120 

CTOCTGCCAA GA6GG0SACA AGTTCAAOCT GA6TAAGG0G GAAATGAAGG AACTTCTGCA 180 

CAA6GA6CTG CCCAGCTTTG T6GGGGAGAA AGTGGATGA6 GAG6G6CTGA AGAAGCTGAT 240 

0G6CAGCCTG GATGA6AACA GTGACCAGCA G6TGGACTTC CAGGAGTATG CTGTTTTCCT 300 

GGCACTCATC ACTGTCATGT GCAATGACTT CTT0CAGG6C TGCCCAGACC GACCCTGAA6 360 

CAGAACTCTT GACTTCCTGC CAT66ATCTC T TOqC C OC A g QACTGTTQAT GCCTTTGAGT 420 

TTTGTATTCA ATAAACTTTT ■ITrU ' AVlOlT GATAATATTT TAATTGCTCA GTGA IG TTCC 480 

ATAACCOGGC TGGCTCA6CT GGAGTGCTGG GAQATGAGGG CCTCCTGGAT OCTGCTCCCT 540 

TCTGGGCTCT GACTCTOCTG GAAATCTCTC CAAGGCCAGA GCTATGCTTT AGGTCTCAAT 600 

TTTQGAATTT CAAACAOCAG CAAAAAATTG SUIATOSAGA TAG6TTGCTG ACTTTTATTT 660 



WO 02/086443

TGTCAAATAA AGATATTAAA AAAGGCAAAT ACCA

Seq ID NO: 16 Protein sequence:
Protein Accession #: NP_005969.1

1          11         21         31         41         51

|          |          |          |          |          |

MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 60

GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP



Seq ID NO: 17 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 939-2372

1          11         21         31         41         51

|          |          |          |          |          |

AAGACGGATT CTCAGACAAG GCTTGCAAAT GCCCCGCAGC CATCATTTAA CTGCACCOGC 60 

AGAATAGTTA CGGTrTCTCA CCOGACCCTC CCGGATOGCC TAATTTGTCC CTAGTGAGAC 120 

CCCGAGGCTC TGCCCGCGCC TGGCTTCTTC GTAGCTGGAT GCATATCGTG CTCOGGGCAG 180 

CXXX3QGCGCA GGGCACGCGT TCGCGCACAC CCTAGCACAC ATGAACACGC GCAAGAGCTG 240 

AACCAAGCAC GGTTTCCaTT TCAAAAAGGG AGACAGCCTC TACCGCGATT GTAGAAGAGA 300 

CTGTGGTGTG AATTAGGGAC CJGGGAGGOST CGAACX3GAGG AAC3GGTTCAT CTTAGAGACT 360 

AATTTTCTGG AGTTTCTGCC CCT6CTCTGC GTCAGCCCTC AOGTCACTTC GCCAGCAGTA 420 

GCAGAOGOGQ CGGCGGCGGC TCCCGGAATT GGGTTGGAGC AGGAGCCTCG CTGGCTGCTT 480 

CGCTCOOJCT CTAC23CGCTC AGTCCCCGGC GGTAGCAGGA GCCTG(3«XX: AGGCGCCXSCX: 540 

GGCGGGOCTG AGGOGCOGGA GCCCGGCCTC GAGGTGCATA CCX3GACCCCC ATTCQCATCT 600 

AACAAGGAAT CTGOGCCCCA GAGAGTCCOS GGAGC6CXX3C CGGTCGGT6C CCGGGGCXXX: 660 

G6GCCAT6CA GCGAOGGCOG CCGCGGAGCT CCX3AGCAG06 GTAGCGCCXX: CCTGTAAAGC 720 

GGTTC6CTAT 6C0GGGG0CA CT6TGAAC0C TGOOGCCTSC GGGAACACTC TTOGCTCX366 780 

AC3CAGCTCAG CCTCT6ATAA GCTGGACTCa GCA08CCC6C AACAAGCACC GAGGAGTTAA 840 

GAGAGCCGCA A6CGCAGGGA AGGCCTCCXT GCACGGGTGG GGGAAAGCX36 CCGGTGCAGC 900 

GCGGGGACAG GCACTCGGGC TGGCACTGGC TGCTAGGGAT GTCGTCCTGQ ATAAGGTGGC 960 

ATGGACC06C CAT6G0G0GG CTCTGGG6CT TCTGCTGGCT G6TTGTGGGC TTCTGGAGGO 1020 

OO GC TTTCGC CT6TCCCA08 TCCTGCAAAT GC3U3IGCCTC TOSGATCTQG TGCAOOSACC 1080 

CTTCTCCT GG CAT06TGGCA TTTCC6A6AT TGGAGCCTAA GA6T6TA6AT CCTGA6AACA 1140 

TCACCGAAAT TTTCATOGCA AACCAGAAAA GGTTAGAAAT CATCAAOSAA GATGATGTTG 1200 

AAGCTTATGT GGGACTGAGA AATCT6ACAA TTGTGGATTC TGGATXAAAA TTTGTGGCTC 1260 

ATAAA6CATT TCTGAAAAAC A6CAACCTGC AGCACATCAA TTTTACCOGA AACAAACTGA 1320 

G6AGT7TGTC TA6GAAACAT TTCG6TCACC TTQACTTOTC TGAACTGATC CTGGTGGGCA 1380 

ATGCATTTAC ATGCTCCTGT GACATTATGT GGATCAASAC TCTCCAAGAG GCTAAATGCA 1440 

GTCCA6ACAC TCAQGATTTG TACT6CCT6A ATGAAAGCAG CAAGAATATT CCCCTGGCAA 1500 

ACCTGCAGAT ACCCAATTGT GGTTTGCCAT CTGCAAATCT GGCCGCACCT AACCTCACTG 1560 

TGGAGGAAGG AAAGTCTATC ACATTATCCT GTAGTGT6GC AGGTGATCOG GTTCCTAATA 1620 

TGTATT6GGA T6TTG6TAAC CTGGTTTCCA AACATAT6AA TGAAACAAGC CACACACAOG 1680 

GCTCCTTAAG GATAACTAAC ATTTCATCOO ATGACAGTGG GAAGCAGATC TCTTGTGTGG 1740 

OGGAAAATCT TGTAGGAGAA GATCAAGATT CTGTCAACCT CACTGTGCAT TTTGCACCAA 1800 

CTATCaCATT TCTCGAATCT CCAACCTCAG ACCACCACTG GTGCATTCCA TTCACTGTGA 1860 

AAGGCAACCC CAAACCA60G CTTCAGTGGT TCTATAACG6 G6CAATATTG AATGAGTCCA 1920 

AATACATCTQ TACTAAAATA CATGTTACCA ATCACACOGA 0TACCAC6GC TGCCTCCAOC 1980 

TGGATAATGC CACTCACATG AACAATGGGG ACTACACTCT AATA6CCAAS AATGAGTATG 2040 

GGAAGGATGA GAAACAGATT TCTGCTCACT TCATGGGCT6 GCCTGGAATT GAOGATGGTG 2100 

CAAACCCAAA TTATCCTGAT GTAATTTATG AAGATTATGG AACTGCAGOS AATGACATOG 2160 

GOGACACCAC GAACAGAAGT AATGAAATCC CTTCCACAGA OSTCACTGAT AAAAC06GTC 2220 

GQSAA CATCT CTOGGTCTAT GC TGTGS TGG TCATTGCGTC TGTGGT GGOA Tm ' OCCATr 2280 

TGGTAATGCT 6TTTCTGCTT AA6TTGGCAA GACACTCCAA 6TTT6GCAT0 AAAGGTTTT6 2340 

TTTTGTTTCA TAAGATCCCA CTQGATGGGT A6CT6AAATA AAGGAAAAGA CAGAGAAAGG 2400 

GGCTGTGGTG CTTGTTGGTT QATGCTGCCA TGTAAGCTGG ACTCCTGGQA CT6CTGTTGG 2460 

CTTATCC0G6 GAAGT6CTGC TTATCTGGG6 TTTTCTGGTA GATGTGGGCG GTGTTTGGAG 2520 

6CTGTACZAT A3GAA6CCT6 CATATACTGT GA6CT6TGA7 TGGGGAACAC CAATGCAOAG 2580 

GTAACTCTCA GGCAGCTAA6 CAG(3UXnX3l AGAAAACATG TTAAATTAAT GCTTCTCTTC 2640 

TTACAGTACT TC3JVATAOA AACTOAAATG AAATCCCATT GGATTGTACT TCTCTTCTGA 2700 

AAA6TGT6CT TTTTGACOCT ACTQQACATT TATTGACTTA ATTGCTTCTG TTTATTAAAA 2760 

TTQACCTGCA AAGTTAAAAA AAAATTAAAG TTGA6AACAG 6TATAAGT6C ACACTGAATA 2820 

GTCTAATCTA GAItSTAACAC ATATTTTAGT GTGATTTTCT ATACTCTAAT CAGCACltSAA 2880 

TTCA6AG6GT TT6ACTTTTT CATCTATAAC ACAGTGACTA AAA6A6TTAA GGGTATATAT 2940 

ACCATCACTT TGGGACTT66 TAGTATTATT AAAAGGTTAT TTCCTTCACT GTCAATAAAA 3000 

GTCCAAATGT TTAGCTTAOG TCTGAGAGTC AAACAATGTT AAGGATTGTC TTAAAGTTCC 3060 

TTAGCCA6CA AAACAAAACA AAACAAAACA AACAAATGAA AAACGTTTAA AAAGAAGAAG 3120 

AA6AAAAAAA ACAASAACAA 0CA6CAACA6 CWrmUiT GGG6CTATAG ATTTAAGTTA 3180 

GGCATAGTCA ATTTCAGAAT AACTAAGAGT GGAATATATG CATAT6GTGA AATTATAACC 3240 

TTGCCCTTTT TTATTTGCCC TCTGC3GATCC ACCTGCTTTT TAGAAGTCTG COGAGTGAGA 3300 

AQGCCACAGT ATCTCATGCT GTTTGCATTA CAGAACTOCA GCTTTTCTAC TCTGAAAAOG 3360 

CCTGGGAGCA GAATG8CTGG CCTGCTGICA GC3U36A6A66 AGATTCTAAG AAGGATAGTC 3420 

COCCCIAGAA CATACTGTCA TACT6CIGG6 'i'LTVCKTOOG TAGGAAAGCT TGTCCTGACC 3480 

CCAGCAGCAA AGA6GT6GCA GGTC6CTAAT GAATATAT6C TTTATAATGT CCTTCTTCAT 3540 

TGCTGA GAGG GCAGCCTTAG AGCTGTGGAT TTCTGCATCC CXXCTQAQTC TGACCCATGG 3600 

ACACCTGTTT CATTCACTTT AGCATCACAG TGACCTTTGT ATGCTCTGTT C3VGTCTGTGT 3660 

CA6GCAGTAT GCTTGTCCTG AAGAGAQGTT TGGCTATCCC CACCCCACCC CACCCCACCC 3720 

TGTTCCTTTT TTATCM3GA0 GACTTCAGA6 CCAGGCCTGC AGCATTTT6T TTGAAAACAC 3780 

AATCA6CTCT GACAGTTAQA CATGCACACA GACGCCATAG CTG6ATTGGA AACATTGATG 3840 

TTTTAAAAAT TTATTTTTTT TGGAAATAGT TGCACAAATG CT6CAATTTA OCTTTAACGT 3900 

TCTATAGATT TTTAACTAGT CCAACACAGT CAGAAACATT GTTTTGAATC CTCTGTAAAC 3960 

CAAGGCATTA ATCTTAATAA ACCAGGATCC ATTTAGGTAC CACTTGATAT AAAAAGGATA 4020 

TCCATAATGA ATATTTTATA CTGCATCCTT TACATTAGCC ACTAAATACG TTATTGCTTG 4080 

ATGAAOACCT TTCACAGAAT CCTATGGATT GCAGCATTTC ACTTGGCTAC TTCATACCCA 4140 
TGGCTTAAAQ AGOGGCftGTT TCTCAAAA6C AGUACAIGC OGOCftGTTCT CAAGTTTTOC 4200 

TCCTAACTOC ATTTSAATGT AAGGGCftGCT GGCCCCCAAT GTGGGGAGGT COGAACATTT 4260 

TCTGAATTCC CATTTTCTTG TTOGCGGCTA AATGACAGTT TCTGTCATTA CTTAGATTCC 4320 

GATCTTTCCC AAAGGTGTTG ATTTACAAAG AGGCCACSCTA ATAGCAGAAA TCATGACCCT 4380 

GAAAGABABA TGAAATTCAA GC7IGTGAGCC AGGCAGQAGC TCAGTATGGC AAAGGTTCTT 4440 

6AGAATCAGC CATTTC6TAC AAAAAAGATT TTTAAAGCTT TTATCTTATA CCATCGA6CC 4500 

ATAGAAAOGC TATGGATTGT TTAAGAACTA TTTTAAAGTG TTCCM3ACGC AAAAAG6AAA 4560 

AATAAAAAAA AAGGAATATT TGTACCCAAC AGCTAGAAQG ATTCCAAGGT AGATTTTTGT 4620 

TTTAAAATGG AGAGAAGTGG ACAGATAAGG CCATTTAATA TATCftAAGAT CAGnGACAT 4680 
CTCCTAGGGA ATGATXSAAAA CAGCAGGCTA T 

Seq ID NO: 18 Protein sequence:
Protein Accession #: CAA53571

1          11         21         31         41         51

|          |          |          |          |          |

MSSWIRWHGP AMARLW6FCW LWGPWPAAP ACPTSCKCSA SRIWCSDPSP GIVAFPRLBP 60 

NSVDPBNITB IPIANQKRLE IINEDDVEAY VGLHNLTIVD SGLKPVAKKA FLKNSNLQHI 120 

NFTSNKLTSL SRKHFRHLOL SELXLVGNPF TC5CDIMKIK TLQEAfCSSPD TQDLYCLHES 180 

SKNIPLANLQ IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNMYW DVGNLVSKHM 240 

NETSHTQGSL RITinSSDDS GKQISCVAEH LVGEDQDSVN LTVHFAPTIT FliESPTSDHH 300 

WCIPFTVKGN PKPALQWPYN GAIZJIBSKyi CTKIHVTiniT EYEGCLCLDN PTHM!QIGDyT 360 

LIAKHBYCKD HXQZ8AHFMG WPQZDDGANP HYPOVIYEDY GTAAHDIGDT THRSNEZPST 420 
DVTDKTGRra LSVYAVWIA SWGFCU.VM LFUiKXASBS KFGMXGPVLF HKZPLDG 



Seq ID NO: 19 DNA sequence
Nucleic Acid Accession #: NM_000228
Coding sequence: 82-3600

1 11 21 31 41 51 

| | | | | |

GCTTTCAGGC GATCTGGAGA AAGAACGGCA GAACACACAQ CAAGGAAAGG TCCTTTCT6G 60 

GGATC3VCCCC ATTGGCTGAA GATGAOACCA TTCTTGCTCT TGTGTTTT6C CCTGCCTG6C 120 

CTCCTGOVTG CCCAACAAGC CTGCTCXX3GT G0660CTGCT ATCCACCTGT TGGGGACCTG 180 

CTTGTTGGGA GGACOOGGTT TCTCOGAGCT TCATCTACXn' GTGGACTGAC CAAGCCTGAG 240 

ACCTACTGCA CCCAGTATGG CGAGTGGCAG ATGAAATGCT GCAAGTGTGA CTCCAGGCAG 300 

CCTCACAACT ACTACAGTCA COGA6TAGAG AATGTGGCTT CATGCTCOGG CCCCATGC6C 360 

TGGTGGCAGT C0CAGAAT6A TGTGAACCXrr GTCTCTCTGC AQ CTOS ACCT GGACAGGA6A 420 

TTCCAGCTTC AAQAAGTCAT GATtSGAGTTC CAGGGSCOCA T60C08CX39G CATGCTGSATT 480 

GAGCGCrCCT CAGACTTCX3G TAAGACCTGG CGAGTGTACC AGTACCTQ6C TGCOGACTGC 540 

ACCTCCACCT TCCCTCGGGT CXXSCCAGGGT OGGCCTCAGA GCTGGCAGGA TGTTOGGTGC 600 

CAGTCCCTGC CTCA6AGGCC TAATGCACGC CTAAATGGGG GGAAGGTGCA ACTTAACCTT 660 

ATGGATTTA6 T G TCTGGGAT TCCAGCAACT CAAAGTCAAA AAATTCAAGA GGTGQ6GGA6 720 

ATCACAAACT TGAGA6TCAA TTTCACCAG6 CTG6CCCCIG T60C0CAAA6 GGGCTACCAC 780 

CCTCCCAGOG CCTACTATGC TGTGTCCCAG CTCOSTCTGC AGGGGAGCTG CTTCTGTCAC 840 

GGCCATGCTG ATCGCTGOSC ACCCAAGCCT GGGGCCTCTG CAGGCCCXTTC CACCGCTGTG 900 

CAC6TOCACX3 ATGTCTGTGT CTGCCAGCAC AACACTGCCG GCCCAAATTG TGAGCGCTGT 960 

QCACCCTTCT ACAACAACCG 6CCCTGGAGA C0GG06GAGG GCCAG6A00C CCATGAATGC 1020 

C31AAGGT6C6 ACTGCAATGG GCACTCA6A6 ACATGTCACT TTQACXXrOGC TGTGTTTGCC 1080 

6CCAGCCAGG GGGCATATGG AGGTGTGTGT GACAATTGCC GGGACCACAC CX5AAGGCAAG 1140 

AACTGTGAGC GGTGTCAGCT GCACTATTTC CGGAACCGGC GCCCGGGAGC TTCX31TTCAG 1200 

GA6A0CTGCA TCTCCTGCXSA GTGTGATCCG GATGGGGCAG TGCCAGOGGC TCCCTGT6AC 1260 

CCAGTGAC06 GGCAGTGTGT GTGCAAGGAO CATGTGCAGG GAGAGOQCTO TGACCIATQC 1320 

AA6C0GG6CT TCACTGGACT CA0CTAC6CC AACC0GCA6G GCTQCCACOG CTGT6ACTGC 1380 

AACATCCTGG GGTCCCGQAG GGACATGCCG TGTGAOGAGG AGAGTGGGOG CTGCCTTTGT 1440 

CTGCCCAACG TGGTGGGTCC CAAATGTGAC CAQTGTGCTC CCTACCACTG GAAGCTGGCC 1500 

A6TG0CCAGG GCT6TGAACC GTOTGCCTGC GACXTGCACA ACTCaXTTCA GCCCACAGTG 1560 

CAACCAGTTC ACAGGGCAGT 6C0CTGT0GG GAAGGCTTT6 GTGGCCTGAT GT6CAGCGCT 1620 

GCAGCCATCC GCCAGTGTCC A6ACCGGACC TATGGAGA06 TGGCCACAOQ ATGCOGAGCC 1680 

TGTGACTGTG ATTTCCGGGG AACAGAGGGC OCQGGCTGOQ ACAAGGCATC AGGCOOCTGC 1740 

CTCTGCCGCC CTGGCTTGAC CGGQCCCCGC TGTGACCAGT GCCAGCGAGG CTACTGCAAT 1800 

0GCrACX:060 TGTGOGTOGC CTGCCACCCT TGCTTCCAGA CCTATGATGC GGACCTCOGG 1860 

GAQGAOGCCC TGGGCTTTGG TAGACTC06C AATGGCACCO CCAQCCT G TG GTCAGGGCCT 1920 

GGGCTGGAGG ACOGTGGCCT GQCCTCCC66 ATOCTAGATG CAAAGA6TAA GATTGAOCAO 1980 

ATCXX3AGCAG TTCTCAGCAG CCCCGCAGTC AC3VGAGCAGG AGGTGGCTCA GGTGGCCAGT 2040 

GCCATCCTCT CCCTCAGGCG AACTCTCCAG GGCCTGCAGC TG6ATCTGCC CCTGGAGGAG 2100 

GAGA0GTT6T CGCTTG06AG AGACCT66A6 AGTCTTGACA GAAGCTTCAA TGGTCTCCTT 2160 

ACTATGTATC AGAGGAAGAG GGA6CAGTTT GAAAAAATAA GCAGTGCTGA TCCTTCAGGA 2220 

OOCTTCOGGA T6CTGAGCAC AQCCTA06A6 CAGTCAGCCC AGGCTGCTCA OCAOGTCTOC 2280 

GACAGGTCGC GCCTTTTGGA CCAGCTCAOG GACAGGOGGA GAGAGGCAGA GAG6CTGGTG 2340 

OGGCAGGCGG GAGGAGGAGG AG6CACC6GC AGCCCCAAOC TTGTGGCCCT GA6GCTGGA6 2400 

ATGTCTT0G7 T60CTGACCT GACACCCACC TTCAACAA6C TCTOTGGCAA CTCCAGGCAG 2460 

AtGGCTTGCA CCOCAATAXC ATGGCCTGCJT GAOCXAIGTC C0CAA6ACAA TGGCACAGOC 2520 

TOTGQCTCCC GCTGCAG6GG 'Atfi'CCl"rCCC AGGGCC6GTG GG60CTTCTT GATGG06GGG 2580 

CAGGTGGCTG AGCAGCTGCG GGGCTTCAAT GCCCA6CTCC AGOGGACCAG GCAGATGATT 2640 

AGGGCAGCCG AGGAATCTGC CTCACAGATT CAATCCAGTG CCCAGOGCTT GGAGACCCAG 2700 

GIGA6CGCCA GCGGCTCCCA GATGGAGGAA 6ATGTCA6AC GCACAOGGCT CCTAATCCAG 2760 

CA6GTCC66G ACTTCCTAAC AGAOOCOQAC ACTGAT6CA0 OCACTATCCA GGAGGTCAGC 2820 

GAGGC06TGC TGGCCCTGT6 6CTGCCCACA GACTCAGCTA CTGTTCTGCA GAA6AT6AAT 2880 

GAGATCCAGG CCATTGCAGC CAGGCTCCCC AACGTGGACT TGGTGCTGTC CCAGACCAAG 2940 

CAGQACATTG CGOGTCCCCG COGGTTGCAG GCTGAGGCTG AGGAAGCCAG GAGCCGAGCC 3000 

CATGCAGTGG A6GGCCAGGT GGAAGATGTG GTT6GGAACC TGCGGCAG6G GACAGTGGCA 3060 

CTGCAGGAAG CTCAGGACAC CATGCAA8GC AOCAGCCGCT C0CTT0G6CT TAT0CAG6AC 3120 

AGGGTTGCTG AGGTTCAQCA GGTACT6CG6 CCAGCAGAAA AGCTGGT6AC AAGCATGACC 3180 

AA6CAGCTGG GTGA C T l 'CT G GACACGGATG GAGGA6CTCC OCCACCAAGC COGGCAGCAG 3240 

GGQGCAGAGG CAGTCCAGGC CCAGCAGCTT G06GAAGGTG CCAQCGAGCA GGCATTGAGT 3300 

6CCCAAGAG6 GATTTGAGAG AATAAAACAA AAGTATGCTG AGTTGAAGGA C06GITGGGT 3360 
CAOAGTrCCft TGCTGG6TGA GCA666TG0C OGGIVTCCA62V GTGTGAA6AC AGA6QCACA0 3420 

GAGCTGTrTG GGGAOICCAT G6AGATGATG O^CAGGATGA AAGACATG6A 6TTG6AGCTG 3480 

CTGOQGGGCA GCCAGGCCAT CATGCTGOGC TOGGOSGACC TGACAGOACT GGAGAAGCGT 3540 

GTGGAGCAGA TCOGTGACCA CATCAATX3GG OGOGTGCTCT ACTATGCCAC CTGCAAGTGA 3600 

TGCTACAGCT TOCAGOCOGT TGCCCC3W:TC ATC TGCO GCC mwrm tj GTTGGGGGCA 3660 

GATXGGGTTG GAATGCTTTC CATCTCCAGG AGACTTTCAT GCAGOCTAAA 6TACAGCCTG 3720 

GACCACCCCT G G T G T G T A OC TAGTAAQATT ACCCT6AGCT GCA6CTGAGC CTGAGCCAAT 3780 

GGGACAGTTA CACTTGACAG ACAAAGATGG TQGAGATTGG CATGCCATTG AAACTAAGAG 3840 

CTCTCAAGTC AAGGAA6CTG GGCTGGGCAG TATCCCCCGC CTTTAGTTCT CCACTGGGGA 3900 

GQAATCCTGG ACCAAGCACA AAAACTTAAC AAAA6TGAT6 TAAAAATGAA AAGCCAAATA 3960 
AAAATCZTTG G 

Seq ID NO: 20 Protein sequence:
Protein Accession #: NP_000219

1 11 21 31 41 51 

| | | | | |

MRPFPLLCPA I»PGIiLHAQQA CSRGACYPPV GDLLVGRTRF LRASSTOGLT KPETYCTQYG 60 

EHQMKCCKCD SRQFHNYYSH RVEKVASSSG PMRHWQSQND VNPVSLQLDL DRRFQLQBVM 120 

MEFXySPMPAG HLIER5S0FG KTNRVYQYLA ADCTSTFPRV RQGRPQSWQD VR0Q5LPQRP 180 

NARUIGGKVQ LNLMDLVSOI PATQSQKIQE VGEITHLRVH FTRLAFVPQR GYHPPSAYYA 240 

VSQLRIjQGSC FCHGHADRCA FKPGASAGPS TAVQVHDVCV CQHHTAGPNC ERCAPFYNNR 300 

PWRPAE6QDA HECQRCDOnS HSETOIFDPA VFAASQGAYG GVCDNCROHT EX3XNCERCQL 360 

HYFRNRRPGA 6IQETCISCE a>FDGAVPGA PCDPVTGQCV CKEHVQGERC OIjCICPGFTGIi 420 

TYANPQGCHR CDCHILGSRR OKPCDEBSGR CLCLPZIVVGP KCDQCAFYHW KLASGQGCEP 480 

CACDPBNSPQ PTVQFVBBAV PCREGFGGX/4 CSAAAIRQCP 23R7Y6DVATG CRACDCDFRG 540 

TB6PGCDKAS GRCLCRPGLT GPRCDQCQRG YCNRYPVCVA CHPCFOTYDA DLREQALRFG 600 

RLSMATASliH SGPGLEDBGL ASRILDAKSK lEQIRAVLSS PAVTEQSVAQ VASAIIiSLRR 660 

TLQGXiQU)IiP IiEEETLSLPR DLESLDRSFN GLL7MYQRKR EQFEKISSAD PSGAFRMLST 720 

AYEQSAQAAQ QVSDSSRLLO QLRDSRREAE RLVRQAGGGO GTGSPKLVAL RLEKSSLPDL 780 

TPTFKKLGGN SRQHACTPIS CPGELCPQDN 6TACGSRCRQ VLPRAGGAFL MAGOVAEOLR 840 

GFNAQLQRTR QMIRAAEESA SQIQSSAQRL ETQVSASRSQ MEEDVRRTRL LIQQVRDFLT 900 

DPDTDAATIQ BVSEAVLALH LPTDSATVLQ KMNBIQAIAA RLPNVDLVIiS QTKQDZARAR 960 

RIOAEAEEAR SRAHAVEGQV EDWGHIAQG TVALQBAQDT MQGTSRSLRL IQDRVABVQQ 1020 

VLRPAEKLVT SMTKQLGDFW TRMEBLRHQA RQQGABAVQA QQIAB6ASEQ ALSAQBQFBR 1080 

IKQKYAELKD RLGQ8SMLGE C2GARIQSVKT EAEELFGEIM EKHDRMXDHB XaBLLRGSQAI 1140 
KLRSADLTGL EKRVEQIRDH INGRVLYYAT CK 

Seq ID NO: 21 DNA sequence
Nucleic Acid Accession #: NM_003722
Coding sequence: 145-1491

1 11 21 31 41 51 

| | | | | |

TCXnrrGATAT CAAAGACAGT TGAAGGAAAT GAATTTTGAA ACTTCACGGT GTGCCaCCCT 60 

ACAGTACTGC CCTGACCCTT ACATCCAGCG TTTOGTAGAA ACCCAGCTCA TTTCTCrTGG 120 

AAASAAAGTT ATTACCXSATC CACCAT6TCC CAGAGCACAC AGACAAATGA ATTCCTCAGT 180 

GCAQA66TTT TCCA8CATAT CTOGGATTTT CTG6AACAGC CTAT A TGTTC AGTTCA6G0C 240 

ATTGACTTOA ACTTTGTGGA TGAAOCATCA 6AAGAT6GT6 CGACAAACAA GATT8A6ATT 300 

AGCATGGACT GTATCCGCAT GCAGGACTCG GACCTGAGTG ACCCCATGTG GCCACAGTAC 360 

ACGAACCTGG GGCTCCTGAA CAGCATGGAC CAGCAGATTC AGAAOSGCTC CTCGTCCACC 420 

AGTCCCTATA ACACAGAOCA OGOSCAGAAC AGCGTCAOGQ OSCCCTOGCC CTACGCACAG 480 

CX:CA6CTGCA CCTTOGATOC TCTCTCTCGA TCACOOGOCA TCCCCTCCAA CACCGACTAC 540 

CCA86CCCQC ACAOTTTOGA GGTGTCCTTC CAGCA6TCGA GCACC6CCAA GT06GCX3V0C 600 

TGGACX3TATT CCACTGAACT GAAGAAACTC TACTGCOU^A TTGCAAAGAC ATGCCCCATC 660 

CAGATCAAOG TGATGACXXXT ACCTCCTCAG GGAGCTGTTA TCCGOGCCAT GCCTOTCTAC 720 

AAAAAAGCTG AGCAGGTCAC GGAGGTGGT6 AA60QGTGCC CCAACCATGA GCTGAGCOGT 780 

GAATTCAA06 AG6GACAGAT T6C00CTCCT AGTCATTTQA TTOGAGTAGA GOGGAACAGC 840 

CAT6CCCAGT ATGTAGAAGA TCCCATCACA GGAAGACAGA GTGTGCTGGT ACCTTATGAG 900 

CCACCCCAGO TTGOCACTGA ATTCAOQACA GTCTTGTACA ATTTCATGTC TAACAGC3VGT 960 

TGTGTTGGAG GGATGAACOQ CCGTCCAATT TTAATCATTG TTACTCTGGA AACCAQAGAT 1020 

GG6CAA6TCC TGGGCCGACG CTGCTTTGAG GCCOGGATCT GT6CTTGCCC AGGAAGAGAC 1080 

AGQAAGGOOS ATGAAGATA6 CATCAGAAAG GAOCAAOTTT O9GAC310TAC AAAGAAOSGT 1140 

GATGGTAOGA AG06CC06TT T08TCA6AAC ACACATGGTA TCCAGAT6AC ATCCATCAA6 1200 

AAAOSAAGAT COCCAGATGA TGAACTGTTA TACTTACX3VG TGAGGGGCOS TGAGACTTAT 1260 

GAAATGCTGT TGAAGATCAA AGAGTCCCTG 6AACTCATGC AGTACCTTCC TCAGCACACA 1320 

ATT6AAAC6T ACAGGCAACA GCAACAGCAG CAGCACCA6C ACTTACT7CA QAAACATCTC 1380 

CTTTCA60CT OCTTCAGGAA T6A6CTTGT6 GAGG0CC6GA QAGAAACTOC AAAACAATCT 1440 

GA06TCTTCT TTA6ACATTC CAA60CCCCA AACOQATCAO TGTAC CCATA GAOC CCTATC 1500 

TCTATATTTT AAGTGTGTOT GTTGTATTTC CATGTGTATA TGTGAGTGTG TGTGTGTGTA 1560 

TGTGTGTGC3G TGTGTATCTA OCCCTCATAA ACAGGACTTG AAGACACTTT GGCTCAISUa 1620 

CCXAACTQCT CAAAGGCACA AA6CCACTAG TGAGAGAATC TTTTGAAGG6 ACTCAAAO CT 1680 

TTACAAGAAA GGAT6TTTTC TGCAGATTTT GTATOCTfAG AOGQOGCATT QGTQOOtGAO 1740 

GAACXACTGT G T i mxr i irr GAGCTTTCTO IWm ' C CTG GGAG0QA0G6 GTCAGGTG6G 1800 

GAAAGGGGCA TTAAQATGTT TATTOQAACC CTTTTCrGTC TTCTTCTGTT GTTTTTCTAA 1860 

AATTCACAGQ GAAGCTTTTG AGCAGQTCTC AAACTTAAGA TGTCTTTTTA AGAAAAGGAG 1920 

AAAAAAGTTG TTATrGTCTG TGCATAAGTA AGTTGTAGGT GACTGAGAGA CTCAGTCAOA 1980 

CCCTTTTAAT 6CTGGTCATG TAATAATATT GCAAGTAGIA AGAAAOSAAG GXGTGAAGT6 2040 

TACT G CTGG G CAGGGAGGT6 ATCATTACCA AAAGTAATCA AC7TTGTGG6 TGGAGAGTTC 2100 

TTTGTGAGAA CTT6CATTAT TTGTGTCCTC CCCTCATGTG TAGGTAGAAC ATTTCTTAAT 2160 

GCTGTGTACC TGCCTCTGCC ACTGTATGTT GGCATCTGTT ATGCTAAAGT TTTTCTTGTA 2220 

CATGAAACCC TGGAAGACCT ACTACAAAAA AACTGTTGTT TG6CCCCCAT AGCAOGTGAA 2280 

CTCATTTTGT GCTTTTAATA GAAAGACAAA TCGAOCCCAO TAATATT60C CTTAOSXAGT 2340 

TGTT7ACCA7 TATTOAAGC TCAAAATAGA ATTTQAAGCC CTCTCACAAA ATCT6T6ATT 2400 

AATTT6CTTA ATTAQAGCTT CTATCCCTCA AOCCTACCTA CCATAAAACC AGCCATATTA 2460 

CTGATACTGT TCAGT6CATT TAGCCAG6AG ACTTAOGTTT T6AGTAAGTG AGATCCAAGC 2520 

AGACGTGTIA AAATCA6CAC TCCTGGACTG 6AAATTAAA6 ATTGAAA6GG TA6ACTACTT 2580 
iitrmTiYr tactcaaaag tttagwsrat crci' G 'm' C T TTocarmA aaaacaiatt 2640 

TTAAGATAAT AGCATAAAGA CTTTAAAMIT tfil'CCl'OCCC TCCJ^TCTTCC CAC3U3CCA6T 2700 
CAOCAGCACT GTATTTTCXG TCACCAA6AC AATGATTTCT TGrTATTGAG OCIGTTGCTT 2760 
TTGTGGATCT GTQATTTTAA TTTTCAATAA ACTTTTGCAT CTTQ6TTTAA AASAAA 

Seq ID NO: 22 Protein sequence:
Protein Accession #: NP_003713

1 11 21 31 41 51 

| | | | | |

MSQSTCnMEF LSPSVFQBZW DFLEQPZCSV QPZDLHFVDB PSEDGATSKZ BZSMDCZSHQ 60 

DSDLSDFHSf? QYTNL6LINS NDQQXQKGSS STSPYBTDBA OKSVTAPSPY AQPSSTFDAli 120 

SPSPMPSNT DYPGPHSFDV SPQQSSTAKS AWTYSTELK KLYOQIAKTC PIQIKVMTPP 180 

PQGAVIRAMP VYKKAEHVTE WKRCPNEEL SRBFNEGQ2A PPSHLIRVEG KSHAQWEDP 240 

ZT6RQSVLVP YBPPQVGTEF TTVLYNPHOf SSCVGGKmiR PILIIVTLBT RDGQVLGRAC 300 

FGARICACPG RDRKADEDSI RKQQVSDSTK NGDGTKRPFR QNTHGIQMTS IKKRRSroDE 360 

LLYIiFVaGRS TYafLUUKB SXiELMQYLPQ RTIBTYRQQQ QQQHQBX.LQR HLLSACFRNB 420 
liVSPSRBTPX QSfiVFFRHSX PPNRSVYP 



Seq ID NO: 23 DNA sequence

Nucleic Acid Accession #: NM_001944.1

Coding sequence: 84-3083 

1 11 21 31 41 51 

| | | | | |

rrrrCTTAGA CATTAACTGC AGAOGGCTGG CAGGATAGAA GCAGOGGCTC ACTTGGACTT 60 

TTTCACCAGG GAAATCAGAG ACAATGATGG GGCTCTTCCC CAGAACTACA GGGGCTCTGQ 120 

CCATCTTCGT GGTGGTCATA TTGG?rTCATG GAGAATTGOG AATAGAGACT AAAGGTCAAT 180 

ATGATGAAGA AGAGATGACT ATGCAACAAG CTAAAAGAA6 GCAAAAAC6T 6AAT0GGTGA 240 

AATTTGCCAA ACOCTGCAGA GAA6GA6AAG ATAACTCAAA AAGAAACCCA ATTGCCAAGA 300 

TTACTTCAGA TTACXS^CA ACXCAGAAAA TCACCTACCS AATCTCTGGA 6TGGGAATCG 360 

ATCAGC08CC rTTTGOAATC TTT6TTGTTG ACAAAAAC3U: TGGAGATATT AACATAACA6 420 

CTATAGTOGA CGGGGAC6AA ACTCCAAGCT TCCTGATCAC ATGTOGGGCT CTAAATGCCC 480 

AAGGACTAGA T6TAGAGAAA CXaCTTATAC TAACGGTTAA AATTTTGGAT ATTAATGATA 540 

ATCCTCCAGT ATTTTCACAA CAAATTTTCA TGGGTGAAAT TGAAGAAAAT AGTGCCTCAA 600 

ACTCACTG6T GAT6ATACTA AATGCCACAG ATGCAGATGA ACCAAACCAC TT6AATTCTA 660 

AAATT6CCTT CAAAATT6TC TCTCAGCSAAC GAOCAGGCAC ACCCAT6TTC CTGCTAAGCA 720 

6AAACACT60 GGAAGTCOGT ACTTT8ACCA ATTCTCTTGA CCGAGAGCAA GCTAGCAGCT 780 

ATCX3TCTGGT TGTGAGTGGT GCAGACAAAO ATGGA6AAGG ACTATCAACT CSkATGTGAAT 840 

GTAATATTAA AGTGAAAGAT GTCAAOGATA ACTTCCCAAT GTTTAGAGAC TCTCAGTATT 900 

CAGCACGTAT TGAAGAAAAT ATTTTAAGTT CTGAATTACT TG6ATTTCAA GTAACAGATT 960 

TGGATGAAGA GTACACA6AT AATTG6CTTG CA6TATATTT CTTT A CCTCT G6GAAT6AAG 1020 

6AAATTGGTT TQAAATACAA ACTGATCC7A 6AACTAATGA AGGCATCCTG AAAGT6GTGA 1080 

AGGCTCTAGA TTATGAACAA CTACAAAGCG TGAAACTTAG TATTGCTGTC AAAAACAAAG 1140 

CTGAATTTCA CCAATCAGTT ATCTCTCX5AT ACCX5AGTTCA GTCAACCCCA GTCACAATTC 1200 

AGGTAATAAA TGTAAGAGAA GGAATT6CAT TCOGTOCTGC TTGCAAGACA TTTACTGTGC 1260 

AAAAAG6CAT AAGTAOCAAA AA A TTO GT OG ATTATATGCT GGGAACATAT CAAG0CAT06 1320 

ATGAGGACAC TAACAAAOCT GCCTCAAATG TCAAATATGT CATGGGAOGT AA0GAT6GT6 1380 

GATACCTAAT GATTGATTCA AAAACTGCTG AAATCAAATT TGTCAAAAAT ATGAACCGAG 1440 

ATTCTACTTT CATAGTTAAC AAAACAATCA CAGCTGAGGT TCTGGCCATA GATGAATACA 1500 

CGGGTAAAAC TTCTACAGGC A06GTATATG TTAGAGTACC C6ATTTCAAT GACAATTGTC 1560 

CAACAGCTGT CCT06AAAAA GATGCAOTTT OCAGTTCTTC ACCTTC06T6 GTT6TCTCCG 1620 

CTAGAACACT GAATAATAGA TACACTGOOC CCTATACATT T6CACTGGAA GATCAACCTQ 1680 

TAAAGTTGCC TGOOGTATGO AGTATCACAA CCCTCAATGC TACCTCGGCC CTCCTCAGAG 1740 

CCCAGGAACA GATAOCTCCT GGAGTATACC ACATCTCCCT GGTACTTACA GACAGTCAGA 1800 

ACAATCG6T6 TGAGATGCX31 CGCAGCTTQA CACTG6AAGT CTGTCAGTGT QACAACA660 1860 

GCATCTGIG6 AACTTCTTAC CCAACCAC3\A GCCCIGQQAC CAGGTASGGC AOGCOQCACT 1920 

CAGGGAGGCT GG6GCCTGCC 6CCATCX3QGC TGCTSCTCCT TG6TCTCCTG CT0CTGCT6T 1980 

TGGCCCCCCT TCTGCTGTTQ ACCTGTGACT GTGGGGC31GG TTCTACTGGG GGAGTGACAG 2040 

GTGQTTTTAT CCCAOTTCXT GATGGCTCAG AAGGAACAAT TCATCAGTGG GGAATTGAAG 2100 

GAGCCGATCC TGAAGACAAG GAAATCACAA ATATTTGT6T GCCTCCTGTA ACAGCCAATG 2160 

GAGCOGATTT CATGGAAA6T TCT6AAGTTT GTACAAATAC GTATGOCAGA G8C31CA6CGG 2220 

TG6AAGGCAC TTCAGGAATG GAAATGACCA CTAAQCTTGG AGCAGCCACT GAATCTG6AG 2280 

GTGCTGCAOO CTTTGCAACA GGGACAGTQT CAGGAGCTGC TTCAGQATTC G6A6CA6CCA 2340 

CTG6AGTT60 CATCTGTTCC TCAGGGCAGT CTGGAACCAT GAGAACAAGG CATTCCACTG 2400 

QAGGAACCAA TAA06ACTAC GCTGATGGGG C6ATAA6CAT GAATTTTCT6 GACTCCXACT 2460 

TTTCTCAGAA AGCATTT6CC TGIGOCSGAOG AA GAC3SATS G CCA9GAA6CA AATOACTO CT 2520 

TGTT6ATCTA TGATAAT6AA GG06CAQATG CCACTGGTTC TCCTGTGGGC TG0GTG6GTT 2580 

GTTGCAGTTT TATTGCTGAT GACCTGGAT6 ACAOCTTCTT GGACTCACTT 6GACCCAAAT 2640 

TTAAAAAACT TGCASAGATA AGCCTTGGTO T7CA3t303X3A A6GCAAAGAA GTTCA6CX:AC 2700 

CCTCTAAAGA CAGOQGTTAT GGGATTGAAT CCTGTGGCCA TCCCATAGAA 6TCCACCAGA 2760 

CAGGATTTGT TAAGT6CCAG ACTTT6TCA0 GAA6TCAAGG AGCTTCTGCT TTGTG06CCT 2820 

CTGGGTCTGT CCAGCCAGCT GTTTCCATOC CTGAOCCTCT GCAGCATGGT AACTATTTAG 2880 

TAAOGGAGAC TTACTCGGCT TCTGGTTCCC TOGTOOUIOC TTCCACTGCA GGCTTTGATC 2940 

CACTTCTCAC ACAAAATGTG ATAGTGACAG AAAGGCTGAT CTGTCCCATT TCCAGTGTTC 3000 

CTGGCAACCT AGCTGGCCCA ACGCAGCTAC GAG6GTCACA TACTATGCTC TGTACAGAGG 3060 

ATCCTTGCTC CGQTCTAATA TGACX31GAAT GA6CTQGAAT ACCACACTGA OCAAATCTGG 3120 

ATCTTTGGAC TAAAGTATTC AAAATAGCAT AGCAAAGCTC ACTGTATTOQ GCTAATAATT 3180 

TGGCACITAT TAGCTTCTCT CATAAACTGA TCAOGATTAT AAATTAAATG TTTGGGTTCA 3240 

TACCCCAAAA GCAATATGTT GTCACTCCTA ATTCTCAAGT ACTATTCAAA TTOTAGTAAA 3300 
TCTTAAAGTT TTTCAAAACC CTAAAATCAT ATTOGC 

Seq ID NO: 24 Protein sequence:
Protein Accession #: NP_001935.1

1 11 21 31 41 51 
MMGLFFRTTG AIAIFVWIL VHGELRIETK GQYDEESfTM QQAKRRQKRB NVKFAKPCRE 60 

GBDNSKHNPr AKITSDYQAT QKITYRISGV GIDQPPFGIP WDKNTGDIN ITAIVDRBBT 120 

PSFLITCRAL NAQGUTVEKP LIbTVKIIiDI MDKPPVPSQQ IFKGSISEHS ASHSI«VMILN 180 

ATDADEPHHL NSKIAFKIVS QEPAGTPMFL LSRHTGBVBT LTKSLDRBQA SSYRLWSGA 240 

0KDGB6LS7Q CEOrilCVKDV NDNFFHFRDS QY8ARIEEm LSSELLRFQV TDhDESTSW 300 

HLAVyPFTSG NBSNWFEIQT DPRTNEGIUC WKAIJ)yEQL QSVKLSIAVK NKABFEQSVI 360 

SRTRVQSTPV TIQVIKVREG lAFRPASKTP TVQKGISSKK LVDYII/3TYQ AIDEDTNKAA 420 

SNVmrVMGRN DGGYLMIDSK TAEIKFVKNM KRDSTPIVNK TITABVIAID BYTGKTSTCT 480 

VYVRVFDF13D NCFTAVLEXD AVCSSSPSW VSARTLKBRY TGPYTFALSD QPVKLPAVHS 540 

ITTLNATSAL LRAQBQIPPG WBISLVLTD SQSOmCENFR SLTLEVOQCD NRGZOSTSYP 600 

TTSPGTRYGR PKSGRL6PAA IGLLLLGUiI. LLLAFLLLLT CD06AGSTGG VTG6FIFVFD 660 

GSB6TIHQW0 lEGAHPEDKB ITNICVPPVT ANGADPMBSS BVCTOTYARG TAVBGTSOIE 720 

MTTKLGAATB SGGAAGFATG TVSGAASGF6 AATGVGICSS GQSGTMRTRH STGGTNKDYA 780 

DCSMSKSThD SYFSQKAFAC AEEZ3DGQEAH DCLLZYDNEG ADATGSFVGS VGCCSFIADD 840 

LDDSFLDSLG PKFRKLAEZS L6VDGEGKEV QPPSKDSGY6 IBSOGHPXEV QQT6FVK0QT 900 

LSGSQ6ASAL SASGSVQPAV SIPDPLQH6N YLVTETYSAS GSLVQPStAG FDPXiXiTQNVI 960 
VTERVICPIS SVPGHLAGPT QLRGSaTMLC TBDPCSRLI 

Seq ID NO: 25 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 56-1642

1 11 21 31 41 51 

| | | | | |

AGTATCCCAG GAGGAGCAAG TGGCACGTCT TOCGACCTAG GCTGCCCCTO CCGTCATGTC 60 

GCAAGGGATC CTTTCTCCGC CAGOGQGCTT QCTGTCOGAT GA0GATGTCX5 TAGTTTCTCC 120 

CATGTTTQAO TCCACAOCTG CAGATTTGGG GTCT6TGGTA CGCAAGAACX TGCTATCAGA 180 

CTOCTCTGTC OTCTCTAOCT C0CTA6AG(a CAAGCAGCAG GTTCCATCTG AGGACAGTAT 240 

GGA6AAGG7G AAAGTATACT TGAGGGTTAG GCCCTT6TTA CCTTCAGAGT TGGAAOQACA 300 

GGAAGATCAG GGTTGTGTCC GTATTGAGAA TGTGGAGACC CTTGTTCTAC AAGCACOCAA 360 

GGACTCTTTT GCCCTTGAAGA GCAATGAAOG GGGAATTGGC CAA6CCACAC ACAGGTTCAC 420 

CTTTTCCCAa ATCTTTOGGC CAGAAOTGGO ACAGGCATCC TTCTTCAAOC TAACT6TGAA 480 

GQAGATGQTA AA06ATGTAC TCAAAG6GCA GAACTG6CTC ATCTATACAT ATGGA6TCAC 540 

TAACTCAGGO AAAACCCACA 06ATTCAAGG TACCATCAA6 GATGGAGGGA TTCTGCCCOG 600 

GTCCCTGGCG CTGATCTTCA ATAGCCTCCA AGGCCAACTT CATCCAACAC CTGATCTGAA 660 

GCCCTTGCTC TCCAATGAGG TAATCTGGCT AGACAGCAAG CAGATCOGAC AGGAGGAAAT 720 

GAAGAAGCTG TCCCTGCTAA ATGGAGGCCT CCAAGAGQAG OAGCIGTCCA CTTCCTTGAA 780 

GAGQAGTGTC TACATG6AAA GTOXSATAGO TACCAGC3UX AGCTTOSACA 6TGGCATT6C 840 

TGGGCTCTCT TCTATCAGTC AGTGTACCAG CAGTAGCCAG CTGGATGAAA CAAGTCATOS 900 

ATGGGCACAG CCAGACACTG CCCCACTACC TGTCCOGGCA AACATTCQCT TCTCCATCTG 960 

GATCTCATTC TTT6AGATCT ACAACGAACX GCTTTATGAC CTATTAGAAC OGCCTAGCCA 1020 

ACAGOGCAAG AGGCAGACTT TGOOGCTATO 08AGGATCAA AAVGGCAATC OCTATSIGAA 1080 

AGATCTCAAC TGGATTCATQ T8CAA6AT6C TGAG6AG6CC TG6AAGCTCC TAAAAGTGSG 1140 

TCGTAAGAAC CAGAGCTTTG CCAGCACCCA CCTCAACCAG AACTCCAGCC GCAGTCACAG 1200 

CATCTTCTCA ATCAGGATCC TACACCTTCA GGGGGAAGGA GATATAGTCC CCAAGATCAO 1260 

OGAGCTGTCA CTCTGTGATC TGGCTQGCTC AGAGCXSCTGC AAAGATCAGA AGAGTGGTQA 1320 

AOOGTTGAAG GAAGCAGGAA ACATTAACAC CTCTCTACAC AOOCTGGOCC GCTGTATTGC 1380 

TGCCCTTGGT CAAAACCAGC AGAACOGGTC AAAGCAQAAC CTGGTTCCCT TCCGTGACAG 1440 

CAAGTTGACT CGAGTGTTCC AAGGTTTCTT CACAGGC06A GGCC3GTTCCT GCATGATTGT 1500 

CAATGTGAAT CCCTGTGCAT CTACCTATGA TGAAACTCTT CATGTGGCCA AGTTCTCAGC 1560 

CATTGCTA6C CAOGTGACTT GTCCAT6CCC CACCTATGCA ACTX9GGATTC CCATOCCTYSC 1620 

ACT06TTCAT CAAGGAACAT AGTCTTCAGG TATCCCCCAO CTTAGAGAAA GGGGCTAAGO 1680 

CAGACACAGG CCTTGATOAT GATATTGAAA ATGAAGCTGA CATCTGCATG TATQ6CAAA0 1740 

AGGAGCTCCT ACAAGTTGTG GAAGCCATQA AGACACTGCT TTTGAAGGAA CGACAGGAAA 1800 

AGCTACAGCT GGAGATGOVT CTCOSAGATG AAATTTGCAA TGAGATGGTA GAACAGATGC 1860 

AACAGOGQGA ACAGTGGTGC A6TGAACATT TGGACACCCA AAAGGAACTA TTGGAG6AAA 1920 

TCrrATGAAGA AAAACTAAAT ATCCTCAA08 AGTCACTGAC AAGTTTTTAC CAAQAA6AGA 1980 

TTCAG6A6C6 G6ATQAAAA6 ATTGAAGAGC TAGAAGCTCT CTTGCAOGAA GCCAOACAAC 2040 

AGTCAGTGGC CCATCAGCAA TCAGGGTCTG AATTGGCCCT AOGGOGOTCA CAAAGGTTGG 2100 

CAOCTTCTGC CTCCACCCAQ CAGCTTCAGG AGGTTAAAGC TAAATTACAG CAGTGCAAAG 2160 

CAGA6CTAAA CTCTACCACT GAAGAGTTGC ATAAGTATCA 6AAAATGTTA GAACCACCAC 2220 

OCTCAGCCAA GCCCTTCACC ATTOATGTGG ACAAGAAGTT AGAAGAGOGC CASAAOAATA 2280 

TAA66CTGTT 6G0GACA6AG CTTCAGAAAC TTQGTGAGTC TCTOCAATCA 6CAGAGA6AG 2340 

CTTGTTGCCA CAGCACTGGG OCAGGAAAAC TTGGTCAAGC CTTGACCACT TGTGATGACA 2400 

TCTTAATCAA ACAGGACCAG ACTCTGGCTG AACT6CAGAA CAACATGGTG CTAGTGAAAC 2460 

TGGACCTT06 GAAGAAGOCA GCATGTATTG CTGA6CAGTA TCATACTGTG TTGAAACTCC 2520 

AA06CCAG6T TTCTCCCAAA AAGG6GCTTG 6TACCAACCA QGAAAATCAG CAAOCAAACC 2580 

AACAACCACC AOGGAAGAAA GCATTCCTTC GAAATTTACT TCCXICGAACA CCAA CCTGO C 2640 

AAAGCTCAAC AGACTGCAGC CCTTATGOCX: OGATCCTAOS CTCAOGGOGT TCCCCTTTAC 2700 

TCAAATCTGG GCCTTTTGGC AAAAAGTACT AAGGCTGTGG GGAAAGAGAA GA6CAGTCAT 2760 

6GCCCTGAG0 TGGGTCAGCT ACTCTCCTGA AGAAATAGGT CTCTTTTATG CTTTACCATA 2820 

TATCAGGAAT TATATCX3U3Q AI6CAAXACT CAGACACTA Q CTTTTTTCTC ACTnTGTAT 2880 

1ATAA0CA0C TATGTAATCT CAT6TTGTTQ TTTTTTTTTA TTTACTTAXA T6A3TTCTAT 2940 

GCACACAAAA ACA6TTATAT TAAAGATATT ATTGTTCACA TTTTTTATTO AATTOCAAAT 3000 
GTA6CAAAAT CATTAAAACA AATTAXAAAA GGQACAOAAA AA 

Seq ID NO: 26 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51 

| | | | | |

HSQGILSPPA GLLSDDDVW SPMFESTAAD LGSWRXKLL SDCSWSTSL EDKQQfVPSED 60 

SMEEVXVYZA VSPU.PSELS R0EDQ6CVRI ETTVETLVLQA PKDSPALKStf ERGIGQATHR 120 

PTPSQIPGPE VGQASPFNLT VKEMVKDVLK GQSWLIYTYG VTNSGKTHTI QGTIKDGGIIj 180 

PHSLALIFNS LQGQLEPTPD LKPLLSNEVI HLDSKQIKQB EMKKLSLLNG GLQEEELSTS 240 

LKRSVYIBSR IGTSTSFDSG ZAGLSSISQC TSSSQLDETS HRHAQPDTAP LFVPANIRFS 300 
ZWISFFSIYN ELLYDLLBPP SQQRKRQTUl LCEDQNtaiFT VKDLKHIHVO SAEBAHKLLK 360 

VGSKNQSFAS TBUIQITSSRS BSZFSISILE LQGBGDZVPK ISSI.SLCDIA OSERCKDQKS 420 

GERLXEAGNX NTSI£TL6HC lAALRQHQOSr RSKQNLVPFR DSKbTRVFQQ FF7GRGRSCM 480 
ZVNVNPCftST YDBTLHVAXF SAZASQVTC31 CPTYATGZFX PALVBQGT 

Seq ID NO: 27 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 13-1424

1 11 21 31 41 51 

| | | | | |

TAGAASnTA CAATGAAQTT TCTTCTAATA CTQCTCCTGC AG60CACTGC TTCTGQAGCT 60 

CTTCCCCTGA ACAGCTCTAC AAGCCTGGAA AAAAATAATG TGCTATTTGQ TGAAACATAC 120 

TTAGAAAAAT TTTATGCCXTT TGAOATAAAC AAACTTCCA6 TCACAAAAAT GAAATATAGT 180 

G6AAACTTAA T6AA3GAAAA AATCCAAGAA AT6CAGCACT TCTTGGCTCT GAAASTGACC 240 

Q03GAACT00 ACftCATCTAC OCTGGAGATQ AT0CA08CAC CTGGATGTGO AGTCCCCXSAT 300 

GTOCATCATT TCAGG6AAAT 6CCAGG6GGG CCCGTATGGA G6AAACATTA TATCAOCTAC 360 

AGAATCAATA ATTACACACC TGACAT6AAC OGTGAGGATO TTQACTACGC AATC06GAAA 420 

GCTTTCCAAG TATGC5AGTAA TGTTACCCCC TTGAAATTCA 6CAAGATTAA CACAGGCATC 480 

GCTGACATTT TGGTGGTTTT TGCCOGTGGA GCTCATGGAG ACTTCCATGC TTTTGATGGC 540 

AAAG6TGGAA T0CTA6CCCA TGCTTTTGGA CCT6QATCTG 6CATT06A0G GGAT6CACAT 600 

TT0SAT6A60 AOBAATTCTO GACTACACAT TCAGGA66CA CAAACTTGTT CCTCACT6CT 660 

GTTCAOGAGA TTGGCCATTC CTTAGGTCTT GGCCATTCTA GTGATCCAAA GGCOGTAATG 720 

TTCCCCACCT ACAAATATGT TGACATCAAC ACATTTCGCC TCTCTGCTGA TGACATAOGT 780 

GGCATTCAGT COCTGTATXSG AGACCCAAAA GAGAACCAAC GCTTGOCAAA TCCTGACAAT 840 

TCAGAACCAG CTCTCTGTGA CCCCAATTTG AGTTTTGArO CTGTCACTAC OGTGGGAAAT 900 

AAGATCTTTT TCTTCAAA6A CAGGTTCTTC TGGCTGAAGG TTTCTGAGAG ACCAAAGACC 960 

AGTGTTAATT TAATTTCTTC CTTATGGtXA ACCTTGCCAT CTX3GCATTGA AGCTGCTTAT 1020 

GAAATTGAAG CCAGAAATCA AGTTTTTCTT TTTAAAGATG ACAAATACTG GTTAATTA6C 1080 

AATTTAAGAC CABAGCCAAA TTATCCCAAG AGCATACATT CTTTTGGTTT TCCTAACTTT 1140 

GTGAAAAAAA TTGATGCAGC TGTTTTnAAC CCACQTTTTT ATAGGACCTA CTTCTTTGTA 1200 

GATAACCAGT ATTGGAGGTA T6ATGAAAGQ AGACAGATGA TGGACCCTGG TTATCCCAAA 1260 

CTGATTACCA AGAACTTCCA AGGAATCGGG 0CTAAAKTT6 ATGCAGTCTT CTACTCTAAA 1320 

AACAAATACT ACTATTTCTT CCAAG6ATCT AACCAATTT6 AATATGACTT CCTACTCCAA 1380 

OGTATCACCA AAACACTGAA AAGCAATA6C ' I ' GGTntJG ' iT GTTGAAAATG GT6TAATTAA 1440 

TGGTTTTTGT TAGTTCACTT CAGCTTAATA AGTATTTATT 6CATATTTGC TATGTCCTCA 1500 

GT6TACCACT ACTTAGAGAT ATGTATCATA AAAATAAAAT CTGTAAACCA 7AGGTAATGA 1560 

TTATATAAAA TACATAATAT TTTTCAATTT TGAAAACTCT AA3T6T0CAT TCTT6CTTGA 1620 

CTCTACTATT AAOTTTGAAA ATAG7TACCT TCAAAOCAAG ATAATTCTAT TTGAAGCATO 1680 

CTCTGTAAGT TGC7TCCTAA CATC Ci T GG A CT6AGAAATT ATACTTACTT CVGGCATAAC 1740 
TAAAATTAA6 TATATATATT TTGGCTCAAA TAAAATT6 

Seq ID NO: 28 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51 

| | | | | |

MXFIiLIIiXtLQ ATASGAZ.PUI SSTSLEKMNV LFGERYLEKP TGLEHiKLFV TRMXYSQUM 60 

XEKZQS4QHP LGLfCVTGQU} TSTZiEKKSAF R06VFDVBEP RBKPG6PVHR KBYXTYRZHK 120 

YTPDMMREDV DYAIRKAFQV WSNVTPLKPS KINTGMADIL WFARGAHGD PHAPDGKGGI 180 

LAHAFGPGSG IGGDAHFD^ EFHTTHSGGT NLFLTAVHEI GHSIiGLGHSS DPKAVMFPTY 240 

RYVDZNTPRL SADDIRGXQS LYGDPKENQR LPHPDNSEPA USPNLSFDA VTTVGNKZFF 300 

FKDRFFfniKV SERPKTSVNL ZSSLWPTLP5 GZEAAYEIEA RNQVFLFKIK) KYWLISMZtRP 360 

EPNYPKSIBS FGFPNFVKKI DAAVFtTPRFY RTYFFVDIffQy tfRYDSRRQMH DPGYPKXiITK 420 
NFQGIGPKIO AVFYSKNKyy YPFQGSNQFB yDFXiLQRITR TUCSKSHFGC 

Seq ID NO: 29 DNA sequence

Nucleic Acid Accession #: NM_006115.1

Coding sequence: 236..1765

1 11 21 31 41 51 

| | | | | |

6CTTCA6GGT ACAGCTCCOC G6CAGCCAGA AGGOSGGCCT GCAGCOCCTC AGCACCX3CTC 60 

CGGGACACCC CACOOGCTTC OCA66CGTGA CCTGTCAACA GCAACTTCGC 6GTGTGGTGA 120 

ACTCTCTGAG GAAAAACCAT TTTGATTATT ACTCTCAGAC GTGOGTGGCA ACAAGTGACT 180 

GAQACCTAGA AATOCAAGCG TTOGAGGTCC TGAGGCCAGC CTAAGTCGCT TCAAAATGGA 240 

A06AAGQ0GT TT G TGG G GTT CCATTCAGAG COGATACATC AGCATGAGTG TGTGGACAAG 300 

OOCAOGGAGA CTTGTOGA6C TQSCAG6GCA 6AG0CT6CTB AA0SATGAG6 CCCTGGOCAT 360 

T6C0GCCCTG GAGTTGCTGC OCAGGGAGCT CTTCCOSCCA CTCTTCATGG CA6CCTTTGA 420 

OGGGAGACAC AGOCAGACCC TGAAGGCAAT GGTQCAGGCC TGGCCCTTCA CCTGCCTOCC 480 

TCTGGGAGTG CTGATGAAGG GACAACATCT TCACCTGGA6 ACCTTCAAAG CTGTGCTTGA 540 

TQQACTTGAT GTGCTCCTTG COCAGGAGGT T06CX:CCA6G AGGTG6AAAC TTCAAGTGCT 600 

06ATTTAC6G AA6AACTCTC ATCAGGACTT CTGGAC1GTA T6GTCTGGAA ACAGGG0CA8 660 

TCTGTACTCA TTTOCAGAGC CAGAAGCA6C TCAGCCCATG ACAAAGAA6C GAAAAGTAGA 720 

TGGTTTGAGC ACAGAGGCAG AQCAGCCCTT CATTCCAGTA GAGGTGCT06 TA6ACCTGTT 780 

CCTCAAGGAA GGTGCCTGTG ATGAATTGTT CTCCTACCTC ATTGAGAAAG TGAAGCGAAA 840 

GAAAAATGTA CTACGCCTGT GCTGTAAGAA GCTGAAGATT TTTGCAATGC CGATSCAOGA 900 

TATGAAGATG ATCCT6AAAA TGGT6CAGCT 06ACTCTATT GAAgATTTOG AAGTGACTT6 960 

TACCTGGAAG CTACCCACCT TGGCGAAATT TTCTCCTTAC CTGGGCCAGA T6ATTAATCT 1020 

GCGTAGACTC CTCCTCTCCC ACATCCATGC ATCTTCCTAC ATTTCCCCQG AGAAGGAAGA 1080 

GCAGTATATC GCCCAGTTCA CCTCTCAGTT CCTCAGTCTG CAGTGCCTGC AGGCTCTCTA 1140 

TGTOGACTCT TTATTTTTOC TTAGAGGCOG CCTGGATCA6 TTGCTCAGGC ACGTGATGAA 1200 

CCCCTTGGAA ACCCTCTCAA TAACTAACTG COGGCTTTCG 6AAGGGGATG TGATGCATCT 1260 

GTCCCAGAGT 0CCAG06TCA GTCAGCTAAG TGTCCTGAGT CTAAGT6GGG TCATGCTGAC 1320 

OGATCTAAGT COOGAGCCOC TCCAAGCTCT GCTGGAGACa GCCTCltSCCA CCCTCCAGGA 1380 

CCTGGTCTTT GATGAGTGT6 GGATCAOGGA TGATCAGCTC CTTGCCCTCC TGOCTTCCCT 1440 

GAGOCACTGC TOOCAGCTTA CAAOCTTAAG CTTCTA0G06 AATTCCATCT CCAXATCTCC 1500 
CTT6CAGAST CTOCTGCAGC ACCTCKT0Q8 GCTGAOCAAT CTGACOCAOG T8CTGTAVCC 1560 

T G TCOCCCTG GAGAGTTAT6 JUSGACA,TOCA TOGTAOCCTC CACCT6GAGA (3GCTTGCCTA 1620 

TCTGCATGCC AGGCTCRGGG AGTTGCTGTG TCAGTTGGGG CGGCCCAGCA TGGTCTGGCT 1680 

TAGTGCCAAC CCCTGTCCTC ACTGTGGGGA CRGAACCTTC TATGACCOGG AGCOCATOCT 1740 

6TGCCCCTOT TTCATGCCTA ACTAGCTGGG TGCACATATC AAATGCTTCA TTCTGCATAC 1800 

TTGGACACTA AAGOCAGGAT 6TGCATGCAT CTTGAAGCAA CAAA6CA6CC ACA8TTTGA0 1860 

ACAAATGTTC A6TGTGAGT0 AGGAAAAOIT GTTCAGTGA6 6AAAAAACAT TCAGACAAAT 1920 

GTTCAGTC5AG GAAAAAAAGG GGAAGTTtSGG GATAGGCAQA TGTTGACTTG AGGAGTTAAT 1980 

GTGATCTTTG GGGAGATACA TCTTATAQAG TTAGAAATAG AATCTOAATT TCTAAAGGGA 2040 

GATTCTGGCT TGGGAAQTAC ATCTTAGGAGT TAATCCCTGT GTAGACTGTT GTAAAGAAAC 2100 
TGTTQUkAAT AAAGAGAAGC AATGIGAAGC AAAAAAAAAA AAAAAAAA 

Seq ID NO: 30 Protein sequence:
Protein Accession #: NP_006106.1 

1 11 21 31 41 51 

| | | | | |

GCTTCAGGGT AC3lGCrCCCC CGCAGCCAGA A6CCGGG0CT GCAGCGCCTC AGCACCGCTC 60 

CGGGACACCC CACCCGCTTC CCAGGCGTGA CCTGTCAACA GCAACTTCGC GGTGTGGTGA 120 

ACTCTCTGAG GAAAAACCAT TTTGATTATT ACTCTCAGAC GTGGGTGGOl AC3W\GTGACT 180 

GASACCTAGA AATCCAAGCG TTOGAGGTCC TGAGGCCAGC CTAAGTOGCT TO^AAATGGA 240 

A06AA6G0GT TT GTO GG G TT CCATTCAGAG CCGATACATC AGCATGAGT6 TGTGGACAA6 300 

CCCA0GGA6A CTTGTGGAGC TGGCAGGGCA GAGCCTGCTG AAGGATGAGG CCCTGGCCAT 360 

TOC06CCCT6 GAGTT6CTGC CCAGGGAGCT CTTCCCGCCA CTCTTCATGG CAGCCTTTGA 420 

CGGGAGACAC AGCCAGACCC TGAAGGCAAT GGTGCAGGCC TGGCCCTTCA CCTGCCTCCC 480 

TCTGGGAGTG CTGATGAAGG GACAACATCT TCACCTGGAG ACCTTCAAAS CTGTGCTTGA 540 

TGGACTTGAT GTGCTCCTTG CCCAGGAGGT TCGCCCX3W3G AGGTGGAAAC TTCAAGTGCT 600 

GGATTTACGG AAGAACTCTC ATCW3GACTT CTGGACTSTA TGGTCTGGAA ACAGGGCCAG 660 

TCTGTACTCA TTTCC A GAGC CAGAAGCA6C TCAGCCCATG ACAAAGAA6C GAAAAGTAGA 720 

TGGTTTGAGC ACAGAGGCAG AGCAGCCCTT CATTCCAGTA GAGGTGCTCO TAGACCTGTT 780 

CCTCAAGGAA GGTGCXn^TG ATGAATTGTT CTCCTACCTC ATTGAGAAAG TGAAGCGAAA 840 

GAAAAATGTA CTAOGCCTGT GCTGTAAGAA GCTGAA6ATT TTTGCAATGC CCATGCAGGA 900 

TATCAABATG ATCCTGAAAA TGGTGCAGCT GGACTCTATT GAAGATTTGG AAGTGACTTG 960 

TACCTGGAAG CTACCCACCT TCGOGAAATT TTCTOCTTAC CTGGGCCASA TGATTAATCT 1020 

GCGTAGACTC CTCCTCTCCC ACATCCATGC ATCTTCCTAC ATTTCCCCGG AGAAOGAAGA 1080 

GCAGTATATC GCCCAGTTCA CCTCTCAGTT CCTCAGTCTQ CAGTGCCTGC AGGCTCTCTA 1140 

TGTGGACTCT TTATTTTTCC TTAGAGGC06 CCTGGATCAG TTGCTCAGGC ACGTGATGAA 1200 

CCCCTTG6AA ACCCTCTCAA TAACTAACT6 CCGGCTITOS 6AA6GGGAT6 TGATGCATCT 1260 

GTCCCAGAGT CCCAGCGTCA GTCAGCTAAG TGTCCTGAGT CTAA6TGGGG TCATOCTGAC 1320 

OGATGTAAGT CCCGAGCCCC TCCAAGCTCT GCTGGAGAGA GCCTCTGCCA CCCTCCAGGA 1380 

CCTGGTCTTT GATGAGTGTG GGATCAOGGA TGATCAGCTC CTTGCCCTCC TGCCTTCCCT 1440 

GAGCCACTGC TCCCAGCTTA CAACCTTAAG CTTCTAOGGO AATTCCATCT CXATATCTGC 1500 

CTTGCAGAGT CTCCTGCAGC ACCTCATOQQ GCXGAGCftAT CTGACCC^OS TGCTGTATCC 1560 

TGTCCCCCTG 6ABAGTTATG AGGACATCCA TGGTACOCTC CACCT66AGA Q6CTT6CCTA 1620 

TCTGCATGCC AGGCTCAGGQ AGTTGCTGTG TGAGTTGGGG CGGCCCAGCA TGGTCTGGCT 1680 

TAGTGCCAAC CCCTGTCCTC ACTGTGGGGA CAGAACCTTC TATGACCOGG AGCCCATCCT 1740 

GTGCCCCTGT TTCATGCCTA ACTAGCTGGG TGCACATATC AAATGCTTCA TTCTGCATAC 1800 

TTGGACACTA AAGCCAGGAT GTGCATGCAT CTTGAAGCAA CAAAGCAOCC ACAGTTTCAG 1860 

ACAAATGTTC AGTGTGAGTG AGGAAAACAT OTTCAGTGAG GAAAAAACAT TCAGACAAAT 1920 

GTTCAGTGAG GAAAAAAAGG GGAAGTTGGG GATAGGCAGA TGTTGACTTG AGGAGTTAAT 1980 

GTGATCTTTG GGGAGATACA TCTTATAGAG TTAGAAATAG AATCTGAATT TCTAAAGGGA 2040 

GATTCTGGCT TGG6AAGTAC ATGTAGGAGT TAATCCCTGT GTAGACTGTT GTAAAGAAAC 2100 
TGTT6AAAAT AAAGAGAAGC AATOTGAAOC AAAAAAAAAA AAAAAAAA 



Seq ID NO: 31 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 64-2754 

1 11 21 31 41 51 

| | | | | |

GGCAGGTCTC GCTCTOGGCA CCCTCCCGGC GCCCGOSTTC TCCTGGCCCT GCCOGGCATC 60 

COGATGGCCG COGCTGGGCC CCGGOQCTCC GTGC QC OGAO CCGTCTGCCT 0CATCT6CTG 120 

CTGACCCTOG TGATCTTCA6 TCGTGATGGT GAAOOCTGCA AAAAGGTGAT ACTTAAT6TA 180 

CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTG6AAGA GTGCTTCAGG 240 

TCTOCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTO 300 

TACACAQCCA GGGCTGTTGC 6CT6TCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360 

GACAAAAGGA AACA6ACACA GAAA6A6GTT ACrGTOClOC TAGAACATCA GAAGAAGGTA 420 

TOGAAGACAA GACACACTAO AGAAACTGTT CTCAGOOOTO CCRA GACG AG ATGGGCAOCT 480 

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540 

GAATCTGAT6 CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 600 

AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATC TATT TT6CACTCGG 660 

CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTQATTG CTTATQCSTC AACIGCAGAT 720 

GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TACaGGATGA AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840 

ACTACAGTGG GGGTGGTTTG TGOCACAGAC A6AGATGAAC CGGA CACRA T GCATACGCGC 900 

CTGAAATACA GCATTTTGCA GCAGACACCA A6GTCACCTG GGCTCTTTTC TGTGCATCCC 960 

AOCACAGQCQ TAATCACCAC AGTCTCTCAT TATTTGGACA 6AGA6GTTGT A6ACAAGTAC 1020 

TCATT6ATAA TGAAAGTACA AGACATGGAT GGOCAGTTTT TTOGATTGAT AGGCACATCA 1080 

ACTTGTATCA TAACACTAAC AGATTCAAAT 6ATAATGCAC CCACTTTCAG ACAAAAT6CT 1140 

TAT^VAGCAT TTGTA6A0GA AAATGCATTC AATGTGGAAA TCTTAOGAAT ACCTATAGAA 1200 

GATAAOaVTT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGG GAAAT 1260 

GAAAATGGAC ATTTCAAAAT CAGCACAOVC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320 

GTAAAGCCAC TGAATTATGA A6AAAAC0GT CAAGTGAACC TGGAAATTG6 AGTAAACAAT 1380 

GAAGCGCCAT TTGCtAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTQGTTACA 1440 

GTTCATGIGA GSQATCTG6A TGAOGGGCCT GAAT6CACTC CTGCAGCCCA ATATGTGOGG 1500 

ATTAAABAAA ACTTAGCA6T QGGQTCAAA6 ATCAACGGCT ATAAGGCATA T6ACCCCGAA 1560 
AAXAGAAATG GCAftTGGTTT AAGGTAOAA AMVTTGOVIG ATCCIAAAG6 TTGGATCftCC 1620 

ATTGATGAAA TTTCAGGGTC AATCATAACT TCOUUUVTCC TGOATAGGGA 06TTGAAACT 1680 

CXXa^AAAATG A£?rPGTATAA TAT7ACAGTC CTGGCAATM ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGTQAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

6AAXATG7AG TCATTTGCAA ACCAAAAATG G6GTATA00G ACATTTTAGC TGTTGATCCT 1860 

GATCAACCIG TCCATGGAGC TCCATTTZAT TTCASITTGC CC3UITACTTC TCCAOAAATC 1920 

AGTAGACTGT GGA6CCTCAC CAAAGTTAAT GAZACA6CTG CCOGTC rf TC ATATCA6AAA 1980 

AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAC3GGC OGGCCAAGCT 2040 

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCX3TGCG 2100 

ACTTOUUSGA GTACAGGAGT AATACTTGGA AAATG6GCAA TCCTTGCAAT ATTACT6G6T 2160 

ATAGCACTQC T Crm xrT tf f ATTGCTAACT TTASTATGTQ GAGTTTTTGG TGOkACTAAA 2220 

OGGAAAOGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA C31CAGAA6CA 2280 

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTGGTAC TATOGGATCA GOAATGAAAA ATGGAGGGCA GGAAACCATT 2400 

GAAAIGAZGA AAGGAGGAAA CCAGAOCTTG GAATCCTGCC GGGG6GCTGG GCATCATCAT 2460 

ACOCTGGACr OCXOCAOQQO AGOACACAOS GA66T6GACA ACT6CAGATA CACTTACT06 2520 

GAGTGGCACA GTTTTACTCA ACCXX9GTCTC GGT(»AAAAT TGCAT0GAT6 TAATCAfiAAT 2580 

GAAGACCGCA TGCCATCCCA AGATTATGTC CTCACTTATA ACTATGAGOO AAGAGGATCT 2640 

CCAGCTGGTT CTGTGGGCTG CTGCAGTGAA AAGCAGGAAG AAGATGGCCT TGACTTTTTA 2700 

AATAATTTGG AAOCCAAATT TATTACATTA GCAGAAGCAT GCACAAAGAG ATAATGTCAC 2760 

AGTGCTACAA irAGGTCTTT GTCAGACATT CTGGAOGTTT CCAAAAATAA TATTGTAAAG 2820 

TTCAATTTCA ACATGTATGT ATATGATCAT TTTTTTCTCA ATTTTGAATT ATGCTACTCA 2880 

CCAATTTATA TTTTTAAAGC CAGTTGTTGC TTATCTTTTC CAAAAAGTGA AAAATGTTAA 2940 

AACAGACAAC TGGTAAATCT CAAACTCCAG CACTGGAATT AA6GTCTCTA AAGCATCTGC 3000 

TCTTTTTTTT TTTTACGGAT ATTTTJ^AA TAAATATGCT G6ATAAATAT TAGTCCAACA 3060 

ATAGGTAAGT TATGCTAATA TCACATTATT ATGTATTCAC TTTAAGTGAT AGTTTAAAAA 3120 

ATAAACAAGA AATATTGAGT ATCACTATGT GAAGAAAGTT TTGGAAAAGA AACAATGAAG 3180 

ACTGAAITAA ATTAAAAATG TTGCAGCTCA TAAAGAATTG GGACTCAOCC CTACTGCACT 3240 

ACCyUVATTCA TTTGACTTTG GAGGCAAAAT GTGTTGAAGT GCCCTATGAA GTAGCAATTT 3300 

TCTATAGGAA TATAGTTGGA AATAAATGTG TGTGTGTATA TTATTATTAA TCAATGC31AT 3360 

ATTTAAAATG AAATGAGAAC AAAGAGGAAA ATGOTAAAAA CTTGAAATGA GGCTGGGGTA 3420 

TAGTTTGTCC TACAATAGAA AAAAQA6AGA 6CTTCCTA0G CCP5G6CTCT TAAATGCTGC 3480 

ATTATAACTO AGTCTATGAG GAAATAOTTC CIGTOCAATT T6TGTAATTT GTTTAAAATT 3540 

GTAAAXAAAT TAAACTTTTC ' IWl T ifl^T 66GAAGGAAA TAGGGAATCC AATGGAACA6 3600 

TAGCTTTGCT TTGCAGTCTG TTTCAAGATT TCTGCATCCA CAAGTTAGTA QCAAACTG6G 3660 

GAATACTCGC TGCAGCTGGG GTTCCCTGCT TTTTGGTAQC AAGGGTCCAO AGATGAGGTG 3720 

TTTTTTTOGG GGAGCTAATA ACAAAAACAT TTTAAAACTT ACXITTTACTG AAGTTAAATC 3780 

CTCTATTOCT GTTTCTATTC TCTCTTATAO TGACCAAOIT CTTTTTAATT TAGATCCAAA 3840 

TAACCATGTC CTCCTAGAGT TTAGA66CTA GA6QQA6CTG AG6GGA6GAT CTTACTGAAA 3900 

GCACCCTGGG GAGATTGATT GTCCTTAAAC CTAAGCCCCA CAAACTTGAC ACCTGATCAG 3960 

GTCTCGGAGC TACAAAATTT CATTTTTCTC CTCACTCCCC TTCTTCXIGAG TGGCATTGGC 4020 

CTGAATCAAG GAAAGCCAGG CCTTOTGGGC CCCCTTCTTT OGGCTTTCIC CT AAAGC AAC 4080 

ACCTCCAGCA 6AGATTCCCT TAAGTOACTC CAC3GTTTT0C AOCATOCTTC AGGGT6AATT 4140 

AATTTTTAAT CAGTTTGCTT TCTCCAGAfiA AATTTTAAAA TAATASAAGA AATAGAAATT 4200 

TTGAATGTAT AAAAGAAAAA GATCAAGTTG TCATTTTAGA ACAGAOGGAA CTTT6GGA6A 4260 

AAGCAGCCCA AGTAGGTTAT TTGTAC3W3TC AQAGGGCAAC AGGAAQATOC AGGCCTTCAA 4320 

GGGCAAGGA6 AGGCCACAAG GAATATGGGT GGGAGTAAAA GCAACATOGT CTGCTTCATA 4380 

CTTTTTOCTA OQCTTSGCAC I' UC C m i'CC TTTCIGAG6C CAATGQCAAC TOCCATTltSA 4440 

GTOOG G TGAQ G6ATCAG0CA AOCTCTTCTC TATGGCTCAC CTTATTTOGA GTGAiSiAATC 4500 

AAGGAGACA6 AGCTGACTGC ATGATGAGTC TGAAGGCATT TGCS^GQATGA GCCTGAACTG 4560 

GTTGTGCAGA ACAAACAAGG CATTCATGGG AATTGTTGTA TTCCrTCTGC AGCCCTCCTT 4620 

CTGGGCACTA AGAAGGTCTA TGAATTAAAT GOCTATCTAA AATTCTGATT TATTCCTACA 4680 

rmtTtfrn ' tctaatttga cgctaaaatc tatgtgtttt agacttagac tttttattgc 4740 

CCCCCCOCCC ■ rm ' T TT' n q AGAOQGAGTC TCQCTCTGAC GCACAGGCTG GAGTGCAGTG 4800 

GCTCOSATCT CTGCTCACTG AAAGCTCCGC CTCCOGGGTT CATGCCATTC TCCTGCCTCA 4860 

GCCTCCTGAG TAGCTGGGAC TACAGGCGCC CACCACCACX3 CCCGGCTAAT TTTTTGTATT 4920 

TTTAATAGAG ACGGGGTTTC ACTGTGTTAG CCAG8ATGGT CTOSATCTCC TOACXTTCSTG 4980 

ATC06CCTGC CTCGGCCTCC CAAAOIGCTG QGATTAGAOO CATGACCCAC 0GCT0C060C 5040 

CTTGTTTTCC GTTTAAAGTC U T C mnm ' AATOTAATCA TTTT6AACAT GTOTGAAAGT 5100 

TGATCATAC6 AATTGGATCA ATCTTGAAAT ACTCAACCAA AAGACAGTCG AGAAGCCA66 5160 

GGGAGAAAGA ACTCAGGGCA CAAAATATTG GTCTGAGAAT GGAATTCTCT GTAAGCCTAG 5220 

TT6CTGAAAT TTCCTGCTGT AACCAGAAGC CAGTTTTATC TAACQGCTAC TGAAACACCC 5280 

ACTGTGTTTT GCTCACTCCC TCACTCACCa ATCAAAACCT GCTACCTCCC CAAGACTTTA 5340 

CTAGTGOOSA TAAACTTTCT CAAAGA6CAA CCAOTATCAC TTCOCTGTTT ATAAAACCTC 5400 

TAACCATCTC TTTGTTCTTT GAACATGCTG AAAACCACCT GGTCTGCATG TATGCCOGAA 5460 

TTTCTAATTC mTCTCTCA AATGAAAATT TAATTTTAGG GATTCATTTC TATATTTTCA 5520 

CATAT6TAGT ATTATTATTT CCTTATATGT GTAAGGTYSAA ATTTATGGTA TTTGAGT6TO 5580 

CAAGAAAATA TATTTTTAAA GCTTTCATTT TTOOCCCAGT GAATGATTTA GAATTTTTTA 5640 

T6TAAATATA CA6AATGTTT TTTCTTACTT TTATAAOGAA GCA6CTGTCT AAAAT6CAGT 5700 

GG6GTTTGTT TTGCAATGTT TTAAACAGAO TTTTAGTATT GCTATTAAAA GAAGTTACTT 5760 

TGCTTTTAAA GAAACTTGGC TGCTTAAAAT AAGCAAAAAT TGGATGCATA AAGTAATATT 5820 

TACAGATGTG GGGAGATGTA ATAAAACAAT ATTAACTTG6 ■ m ' C ' l ivm TTGCTGTATT 5880 

TAGAGATTAA ATAATTCTAA 6 AJSATCACT TT8CAAAAT7 AIGCTTATGG CTGGCATG6A 5940 

AATAGAAATA CTCAATTATG TCTTTGTTGT ATTAATGGGO AATATTTT6Q ACAATGTTTC 6000 

ATTATCAAAT TGTCXaACATC ATTAATATAT ATTGTAATGT TGG6AAGAGA TCACTATTTT 6060 

GAAGCACAGC TTTACAQATG AGTATCTATG ATACATATGT ATAATAAATT TTGATOGGGT 6120 

ATTAAAAGTA TTAGAAGGT6 GTTATAATTG CAGAGTATTC CATOAATAGT ACACTGACAC 6180 

AGGGGTTTTA CTTT6A0GAC CAGTGTAGTC AAGOQAAAAC ATOAOTTAAA AAGAAAAGCA 6240 

GGCAATATTG CAGTCTTGAT TCTGCCACTT ACAOGATAGA TAAT6CCTGA ACTTTAATGA 6300 

CAAGATGATC CAACCATAAA GGTQCTCTGT GCTTCACAGT GAATCTITTC CCCATGCAGG 6360 

AGTGTGCTCC CCTACAAAOG TTAA6ACTGA TCATTTCAAA AATCTATTAG CTATATCAAA 6420 

AGCCTTACAT TTTAATATAG GTTGAACCAA AATTTCAATT OCAGTAACTT CTATTGTAAC 6480 

CATTATTTTT GTGTATGTCT TCAAGAAT6T TC3 1TT6GATT TTTGrTTGTA ATAGTAAAAT 6540 

ACOGGATACA TTTCAOGTGT OCTTCAGTAT TGATTTGGTT (SUVTATTGG6 TCATAATGGT 6600 

TGAGAAGCAT GGACACTAGA 6CCAGAATGC TTGQATATQA ATCCTOQATC TGTCACTTAC 6660 

TTCTGTGTGA CCTTTGAAAG GCTACTTATT TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720 

6AACAAT0CC AGOCTCATGG G6TTGTTGAA TGATIAAATT AGTTAATAXA CCZAAAGTAC 6780 
ATAGAAOICT GCCTGCACAT ABTAAAAGM TTATAAGT6T GAGGTAGTTG GTAAAATTAT 6840 

6TAGTTG6AT ATACTACOGA ACAATATCTA ATCTCITTTT AGGGAAAJAA AGTTTGIOCft 6900 
TATATATAAT CCCX3UVACAT G 

Seq ID NO: 32 Protein sequence: 
Protein Accession #: NP_001932.1 

1 11 21 31 41 51 

| | | | | | 

MAAAGPRRSV RGAVCLHLliL TLVIFSSDGE ACKKVILNVP SKLEADKIIG RVNLE5CPRS 60 

ADLIRSSDPD FRVLHDGSVY TARAVALSDK KRSFTIWLSD XRKQTQREVT VXiLEHQKKVS 120 

KTSHTRSTVL RRAKRRWAPI PCSKQQfSIiG PFPLFIiQQVE SDAAQNYTVF YSI9GRCVDK 180 

EPUUiFYIER DTGNLPCTRP VDREEHfDVFD LIAYASTADG YSADLPLPLP IRVEDSNDNH 240 

PVPTEAIYNF EVLBSSRPGT TVGWCATDH DEPDTMHTRL KYSILQQTPR SPC3LPSVHPS 300 

TCVITTVSHY LDREWDKYS LIMKVQDMDG QFFOLIGTST CIITVTDSND KAPTFRQNAY 360 

EAPVEOJA™ VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETMEGVLSW 420 

KPllTYEENRQ VNLEIGVNNS APFARDIPRV TALHRALVTV HVRDLDEiGPB CTFAAQYVRI 480 

KEtniAVGSKI NGYKAYDPEN RMGKGIiRYKK LHDPKGWITX DSISGSIITS KILDREVETP 540 

KNEliYKITVIi AIDKDDRSCT GTLAVNIEDV NDNPPEZLQS YWICRPKMO YTOILAVDPD 600 

EPVHGAPFYP SLPNTSPEIG RLHSLTICVKD TAARLSYQKH AGPQBYTIPI TVKDRAGQAA 660 

TiOiIiRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLPSVLLTL VCGVFGATKG 720 

KRFFEDLAQQ NLIZSNTEAP GDDRVCSAKO FMTQTTNKSS QGFOGTMSSG MKNGGQBTIE 780 

KMKGGNQTIiS SCSGA6BHHT LDSCSGGRTB VDNCRYTYSB HH5F7QPRLG EXLBRaSQtiB 840 
DRMPSQDYVIi TyNYEGRGSP AGSVGCCSEK QBSDGXtDFW NLEPKFZTLA EACTKR 



Seq ID NO: 33 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 64-2583 

1 11 21 31 41 51 

| | | | | | 

GGCAG6TCTC GCTCTGQQCA CCCTCCOQOC GCXrOGCGTTC TCCTGGCCXT GCCCGGCATC 60 

COGATGGCOQ CCGCTGGGCC C08Q06CT0C GTGOGOGGAG COGTCTGCCT GCATCTGCTG 120 

CTGAOXrrCG TGATCTTCAG TOGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180 

CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240 

TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300 

TACACAGC3CA GG6CP8TT6C GCTOTCTGAr AAGAAAAGAT GATTTAOCAT A106C77TCT 360 

GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGrTGCTGC TAGAACATCA GAAGAAGGTA 420 

TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACCT 480 

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAA6TT 540 

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACX3 TGGAGTT6AT 600 

AAAGAACCTT TAAATTIGTT TTATATAGAA AGAGACACTC GAAATCXATT TT6CACTO0G 660 

CCTGTGGATC GT6AAGAATA TGATGTTTTT GATTTGATTG CTTAT60GTC AACTGCAGAT 720 

GGATATTCAO CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840 

ACTACAGTGG GGGTGGTTT6 TGCCACAGAC AGAGATGAAC CXSGACACAAT GCATAOGOGC 900 

CTGAAATACA 6CATTTTGCA GCA6ACACCA AOOTCACCTO G6CTCTTTTC TOTGCATCOC 960 

AGCACAGGCG TAATCACCAC AGTCTCTCAT TATTTGGACA GA6AGGTTGT AGACAAGTAC 1020 

TCATTGATAA TGAAAGTACA AGACATGGAT 6GCCAGTTTT TTGGATT(SVT AGGCACATCA 1080 

ACTTGTATCA TAACAOTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGA6GA AAATGCATTC AATGTGGAAA TCTTAC6AAT ACCTATAGAA 1200 

GATAAGGATT TAATTAACAC TGOCAATTGG AGAGTCAATT TTACCATTTT AAAGGSUUIT 1260 

(»AAATOGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA AT6AAGGTGT TCTTTCTGTT 1320 

6TAAAGCCAC TGAATTATGA AGAAAACOGT CAAGTGAACC TGGAAATTGQ AGTAAACAAT 1380 

GAAGC6CCAT TTGCTAGAGA TATTCCCAGA 6TGACAGCCT TGAACAGAGC CTT6GTTACA 1440 

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCOCA ATAT6TGCX3G 1500 

ATTAAAGAAA ACTTAGCA6T GGGGTCAAA6 ATCAA06GCT ATAAOGCATA TGACCCOGAA 1560 

AATA6AAATG GCAATGGTTT AAGGTACAAA AAATTGCAT6 ATCCTAAAGO TTG6ATCACC 1620 

ATTGATGAAA TTTCAGGGTC AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 1680 

CCX:aAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TT G CTGTGAA CATTGAA6AT GTAAATGATA ATCXZAGCAGA AATACTTCAA 1800 

GAATATGXAO TCATTT6CAA ACCAAAAA70 6GGTATAC0G ACATTTTAOC TGTT6ATCCT 1860 

6ATGAACCTG TOCATGGAGC TCCATTTTAT TTCA6TTTGC CCAATACTTC TCCAG9UVATC 1920 

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG COOGTCTTTC ATATCAGAAA 1980 

AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAG6GC CGGCCAAGCT 2040 

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGOQ 2100 

ACTTCAAG6A 0TACA08AGT AATACTTGGA AAATGGGCAA T0CTT6CAAT ATTACTGGGt 2160 

ATAGGACT6C TCTTTTCrGT ATTGCTAACT TTAGTATGT6 GAGTTTTTGG TGCAACTAAA 2220 

GGGAAAOGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTAT6A CCCAAACTAC CAACAACTCT 2340 

AOCCAAGGTT TTTCTGGTAC TATGGGATCA G6AATGAAAA ATGGA6GGCA GGAAACCATT 2400 

GAAATGATGA AA6GAGGAAA CCAGACCTTG GAATCCTGCC G6GG66CTGG GCATCATCAT 2460 

ACCCTGGACT CCTGCAGGGG AGGACACAOS GAGGT6GACA ACIGCAGATA CACTTACTQG 2520 

GAGTGGCACA GTTTrACTCA ACCCCGTCTC GGTGAAGAAT CCATTAGAGG ACACACTGGT 2580 

TAAAAATTAA ACATAAAAGA AATTGCATOS ATGTAATCAG AATGAAGACC GCATGCCATC 2640 

CX3UIGATTAT GTCCTCACTT ATAACTATGA GGGAAGAGGA TCTOCAGCTG 6TTCTGTQ0G 2700 

CT6CTGCAGT 6AAAA0CAGG AA6AAGATGG CGTTGACTTT TTAAATAATT TGGAACCCAA 2760 

ATTTATTACA TTAGCAGAAG CATGCACAAA GAGATAATGT GACAGTGCTA CAATTAGGTC 2820 

TTTGTCAGAC ATTCPGGAGG TTTCCAAAAA TAATATTGTA AAGTTCAATT TCAACATX5TA 2880 

TGTATATGAT GATTTTTTTC TCAATTTTGA ATTATGCTAC TCAOCAATTT ATATTTTTAA 2940 

AGCCAGTTGT TGCTTATCTT TTCCAAAAAG TGAAAAATGT TAAAACAGAC AACTGGTAAA 3000 

TCTCAAACTC CAGCACTGGA ATTAAGGTCT CTAAAGCATC TGCTCTTTTT TTTTTTTAOG 3060 

GATATTTTAO TAATAAATAT GCTGGATAAA TATTAGTCCA AC3UVTAGCTA AGTTATGCTA 3120 

ATATCACATT ATTATGTATT CACTTTAAGT GATAGTTTAA AAAATAAACA AGAAATATTG 3180 

AGTATCACTA TGTGAAGAAA GTTTTGGAAA AGAAACAATG AAGACTGAAT TAAATTAAAA 3240 

A1GTT6CAGC TQkZAAAGAA TTGGGACTCA CCXX:TACTGC ACTACCAAAT TCA37TGACT 3300 
TTG6AGGCAA AAT6T6TTGA AGTGCOCTAT GAAGTAGCAJ^ TTTTCIATA6 GAftTATAGTT 3360 

GGAAATAAAT GT G TGTGTGT AXATTATTAT TAATCAATGC AATATTTAAA ATGAAATGAS 3420 

AACAAAGAG6 AAAATGGTAA AAACTTGAAA TGAGGCTGG G GTATAGTTTG TCCIACAATA 3480 

GAAAAAAGAG AGAGCTTCXrr AGGCCTGGGC TCTTAAATGC TGCATTATRA CTGAGTCTAT 3540 

GAG(3AAATftG TTCCTGTCCA ATTTGTGTAA TTTGTTTAAA ATTGTAAATA AATTAAACTT 3600 

TTCZOST3TC TGlGGG R flflG AAATAGGBAA TOCAATGGAA CAG2A6C7T7 GCTTTGCA6T 3660 

CTGTTTCAAG ATTTCTGCAT CCkCMJ3T£A GTAGCAAACT GGGGAATACT CGCTGCASCT 3720 

GGGGTTCCCT GCTTTTTQtST AGCAAGQGTC C3USAGATGAG (JlVl rmTl ' CGGGGAGCTA 3780 

ATAACAAAAA CATTTTAAAA CTTACCTTTA CTGAAGTTAA ATCCTCTATT GCXX3TTTCTA 3840 

TTCTCTCTTA TAGTGACCAA CATCTTTTTA ATTTAGATCC AAATAAOCAT GTCCTCCTAG 3900 

AGTTTAGAGG CTAGAGGQAG CTGAGGG6AG GATCTTACTG AAAGCACCCT GGGGAOATTG 3960 

ATTGTOCTTA AAOCTAAGCC CCACAAACTT OACAOCrSAT CA66TCTOG6 AGCTACAAAA 4020 

TTTCATTTTT CTCCTCACTG CCCTTCTTCT GA6TGGCATT GGCCTGAATC AAGGAAAGCC 4080 

AGGCCTTGTQ GGCCOCCTTC TTTOGGCTTT CTGCTAAAGC AACACCTCCA GCAGAGATTC 4140 

CCTTAAGTGA CTCCAGGTTT TCCACCATGC TTCA606TGA ATTAATTTTT AATCA6TTTG 4200 

CTTTCTCCAG AGAAATTTTA AAAXAATAGA AGAAAXAGAA ATTTTGAAT6 XATAAAAGAA 4260 

AAA6ATCAAG TTGTCATTTT AGAACAGAOG GAACTTTG6G AGAAAGCAGC CCAAGTAGGT 4320 

TATTTGTACA GTCAGAGGGC AACAGGAAGA TGCAGGCCTT CAAGGGCAAG GAGAGGCCAC 4380 

AAGGAATATG GGTGGC3AGTA AAAGCAACAT OGTCTGCTTC ATACTTTTTC CTAGGCTTGG 4440 

CACTGCCTTT TCCTTTCTCA GGCCAATG6C AACTGOCATT TGAGTCC6GT GAGGGATCAG 4500 

CCAACCTCTT CTCTATGGCT CACCTTATrr GGAGT6AGAA ATCAAG6AGA CAGAGCTGAC 4560 

TGCATGATGA GTCTGAAGGC ATTTGC31GGA TGAGCCP6AA CTGGTTGTGC AGAACAAACA 4620 

AGGCATTCAT GGGAATTGTT GTATTCCTTC TGCAGCOCTC CTTCTGGGCA CTAAGAAGGT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACATTTTCTG TTTTCTAATT 4740 

TGACCCTAAA AXCTATGTGT TTTAGACTTA GACTTTTTAT TGCCCCCCCJC CCCTTTTTTT 4800 

TTGAGAOSQA GTCTOGCTCT GACGCACAGG CTGGAGTGCA GTGGCTCCGA TCTCTGCTCA 4860 

CTGAAAGCTC CGCCTCCCJGG GTTCATGCCA TTCTCCT6CC TCAGCCTCCT GAGTAGCTGQ 4920 

GACTACAGGC GCCCACCACC ACGCCCGGCT AATTTTTTGT ATTTTTAATA GAGACGOGGT 4980 

TTCACTGTGT TAGCCAGGAT GGTCTOSATC TCCTGACX:TC GTGATCCGCC TGCCTCGGCC 5040 

TCCCAAAGTC CTGGGATTAC AGGCATGACC CACCGCTCCC GGCCTTGTTT TCCX5TTTAAA 5100 

GTCQTCTTCT TTTAATGTAA TCATTTTGAA CATGTGTGAA AGTTGATCAT ACGAATTGGA 5160 

TCAATCTTQA AATACTCAAC CAAAAGACAG TCGAGAAGCX: AGGGGGAGAA AGAACTCAGG 5220 

GCACAAAATA TT G GTCT G AG AATGGAATTC TCTGTAAGCC TAGTTGCTGA AATTTCCTGC 5280 

TSTAACCAGA AGCCAGTTTT ATCTAACGGC TACTGAAACA CCCACTGTGT TTTGCTCACT 5340 

CCCACTCACC GATCAAAACC TGCTACCTCC CCAAGACTTT ACTAGTGCOS ATAAACTTTC 5400 

TCAAAGAGCA ACCAGTATCA CTTCCXTTGrT TATAAAACCT CTAACCATCT CTTTGTTCTT 5460 

TGAACATGCT GAAAAOCACC TGGTCTGCAT GTATGCCCGA ATTTGTAATT CTTTTCTCTC 5520 

AAATGAAAAT TTAATTTTAG 66ATTCATTT CTATATTTTC ACATATGTAG TATTATTATT 5580 

TCCTTATAT6 TGTAAGGT6A AATTTATGGT ATTTGA6TGT GCAAGAAAAT ATATTTTTAA 5640 

AGCTTTCATT TTTOCCCCAG TGAATOATTT AGAATTTTTT ATGTAAATAT ACA6AAT6TT 5700 

TTTTCTTACT TTTATAAGGA AGOUSCrGTC TAAAATGCAG TGGGGTTTGT TTTGCAATGT 5760 

TTTAAACAGA GTTTTA6TAT TQCTATTAAA AGAAGTXACT TTGCTTTTAA AGAAACTTGG 5820 

CT6CTTAAAA TAAGCAAAAA TTG6ATGCAT AAA6TAATAT TTACAGATGT GGQGAGATOT 5880 

AATAAAACAA TATTAACTTG 6CTGCTTAAA ATAAGCAAAA ATTG6ATGCA TAAASTAATA 5940 

TTTACAGATG TGGGGAGATG TAATAAAACA ATATTAACTT GGTTT C T TG T TTTTGCIGTA 6000 

TTTAGAGATT AAATAATTCT AAGATGATCA CTTTGCAAAA TTATQCTTAT GOCTGGCATG 6060 

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATGG GGAATATTTT GGACAATGTT 6120 

TCATTATCAA ATTGTOQACA TCATTAATAT ATATTQTAAT GTTGQGAA6A GATCACTATT 6180 

TTGAAGCACA GCTTTACAGA T6AGTATCTA TGATACATAT GTATAATAAA TTTTGATGOQ 6240 

GTATTAAAAG TATTAGAAGG TGGTTATAAT TGCAGAGTAT TCCATGAATA GTACACTGAC 6300 

ACA6GGGTTT TACTTTGAOG ACCA6TGTAG TCAAGGGAAA ACATQAGTTA AAAAGAAAAG 6360 

CAGGCAATAT TGCAGTCTTG ATTCTGCCAC TTACAOGATA GATAATGCCT GAACTTTAAT 6420 

GACAAGATGA TCCAACCATA AAGGTGCTCT GTGCTTGACA GIGAATCTTT TCCCCATGCA 6480 

G6AGTGT6CT CGCCTACAAA OGTTAAGACT GATCATTTCA AAAATCTATT A6CTATATCA 6540 

AAAGCCTTAC ATTTTAATAT AGGTTGAACC AAAATTTCAA TTCC3W3TAAC TTCTATTGTA 6600 

ACCATTATTT TTGTGTATGT CTTCAAGAAT GTTCATTGGA TTTTTGTTTG TAATAGTAAA 6660 

ATACOQGATA CATTTCAOGT GTCCTTCAGT ATTGATTTGG TTGAATATTG G6TCATAATG 6720 

GTTGAGAAGC ATG6ACACTA GAGCCAGAAT OCTTGGATAT 6AATCCTGGA TCTGTCACTT 6780 

ACTTCTGTOT GACCTTTOAA AG6CTACTTA TTTCCTCTCT TAGCTTTCTC ATTAAAATCA 6840 

ATGAACAATG CCAGCCTCAT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900 

ACATAGAACA CTGCCTGCAC ATAGTAAAAG AATTATAAGT GTGAGGTAGT TGGTAAAATT 6960 

ATGTAGTTGG ATATACTACC GAACAATATC TAATCTCTTT TTAGGGAAAT AAAGTTTGTG 7020 
CATATATATA ATOCOQAAAC ATG 

Seq ID NO: 34 Protein sequence: 
Protein Accession #: NP_077741.1 

1 11 21 31 41 51 

| | | | | | 

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGB ACKKVILNVP SKLEADKIIG RVNLEECFSS 60 

ADLIRSSDPD PRVLHDGSVY TARAVALSDK KRSFTIHLSD KRKQTQKEVT VLLEHQKKVS 120 

XTRBTRETVL RRAKRRMAPZ PCSNQENSL6 PFFliFLQQVS SOAAQNYTVF YSISGR6VDK 180 

EPLHIiFYIBR DTQMIiFCTRP VDREBYDVFD LIAYASTADG YSADtiPIiPItP ZRVBDOIDNH 240 

PVFTBAXW EVLBSSRP6T TVGWCATDR DBPDTMHTRL inrSZLQQTP& SP6LFSVHPS 300 

T6VITTVSHY LDREWDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND KAPTFRQNAY 360 

EAFVSEHAFN VEILRIPIED KDLINTAITHR VNFTIIjKCaiE NGSFKISTDK ETNBGVXiSW 420 

KPLHYEENRQ VNLBXGVKKE APFARDIPRV TAlMRAhVTV BVRDLDE3GPE CTFAAQYVRI 480 

KBHLAVGSKZ MGYKAYDFEV RNOIGtiRyKK LBDPSBWZTI DEISGSXITS KILOREVETF 540 

KKELYNITVL AIDKDDRSCT GTLAVNIEDV NDKPFEIXiQE YWICKFKMG YTDILAVDPD 600 

BPVHGAPPYF SLPNTSPEZS TOMSLTK^Jm) TAARLSYQKN A6FQEYTZPI TVKDSAGQAA 660 

TKLLRVKLCE CTUPTQCRAT SRSTGVZLGK HAZLAZIiLGI ALLPSVLLTL VCGVFGATKG 720 

KRFPEDLAQQ NLZZSNTEAP QJDRVCSAHG FMTQTTHHSS QGFCXSTMGSG MKNGGQETZE 780 
MKRSGKQTLB SCR6AGRRBT LOSCBGGBTB VDNCSnrTYSB HESFTQSHLG BBSZRGHTG 

Seq ID NO: 35 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 146-1273 

1 11 21 31 41 51 

| | | | | | 

GGGAGTGGGC GTGGOGGTGC TGCCCAGGTQ AGCX31CCGCT GCTTCTCCCC AGACAOGGTC 60 

GCCTCCACAT CCAGGTCTTT GTGCTCCTOQ C T lXSCCl W r CCTTTTOCAC GCATTTTCCA 120 

OGftnU^CIGT GACTCCAGGC COGCAkTOGA TGCOCTGCMV CTA6CAAATT GGGCTTTTGC 180 

CSSrCGHTCTG TTCWUVCMC TATGT6AAAA OGAGCGACTG GGCAATGTCC TCTTCTCTCC 240 

AATCTGTCTC TCCACCTCTC TGTCACTTGC TCAAGTGGGT GCTAAAGGTG ACACTGCAAA 300 

TGRAATTGGA CAGGTTCTTC ATTTTGAAAA TGTCAAAGAT ATACCXTl'lti GATTTCAAAC 360 

AGTAACATG6 GATGTAAACA AACTTAGTTC CTTTTACTCA CTGAAACTAA TCAAGCGGCT 420 

CTAOGTAOAC AAATCTCTGA ATCTTTCTAC AGAGTTCATC AGCTCT A OGA A6AGACCCTA 480 

TQCAAA06AA TTGQAAACTG TTGACTTCAA A6ATAAATTQ GAAGAAAOGA AA6GTCAGAT 540 

CAACAACTCA ATTAAGGATC TCACAGATGG CCACTTTGAG AACATTTTAG CTGACAACAG 600 

TGTGAACGAC CAGACCAAAA TCCTTGTGGT TAATGCTGCC TACTTTGTTG GCAAGTGGAT 660 

GAAGAAATTT CCTGAATCAG AAACAAAAGA ATGTCCTTTC AGACTCAACA AGACAGACAC 720 

CAAACCAGTG CAGATGATGA ACATGGAGGC CAOGTTCTGT ATGOGAAACA TTGACAGTAT 780 

CAATTGTAAG ATCATAGAOC TTOCTTTTCA AAATAAGCAT CTCAGCATGT TCATCCTACT 840 

A00CAAG6AT GTGGAGGATG AGTCCAC3\GG CTTGGAGAAG ATTGAAAAAC AACTCAACTC 900 

AGAGTCACTO TCACACTGGA CTAATCCCAG CACCATOSCC AATGCCAAGG TCAAACTCTC 960 

CATTCCAAAA TTTAAGGTGG AAAAGATGAT TGATCCCAAG GCTTGTCTQG AAAATCTAGG 1020 

GCTGAAACAT ATCTTCAGTG AAGACACATC TGATTTCTCT GGAATCTCAG AGACCAAGGG 1080 

AGTGGCCCTA TCAAAT6TTA TCCACS^AAGT GTGCTTAGAA ATAACTGAAO ATGGTGGGGA 1140 

TTGCATAQAa GTGCCAGGAG CAC6GAT0CT 6CAGCACAAG GATGAATTGA AT6CTGACCA 1200 

TCOCTTTATT TAGATCATCA GGCACAACAA AACTOSAAAC ATCATTTTCT TTGGCAAATT 1260 

CiXaiTCA'CCT TAAGTGGCAT AGCXXATGTT AAGTOCTCCC TGACTTTTCT GTGGATGCOO 1320 

ATTTCTGTAA ACTCTGCATC CAGAOVTTCA TTTTCTAGAT ACAATAAATT GCTAATGTTC 1380 

CTGGATCAG6 AAGCCGCCAG TACTTGTCAT ATGTAGCCTT CACACAGATA GACCTTTTTT 1440 

TTTTTCCAAT TCTATCTTTT tfmwrm TTCCOITAAG ACAATGACAT AOGCTTTTAA 1500 

TGAAAAGGAA TCAC3GTTAGA G6AAAAATAT rTATTCATTA TTTGTCAAAT TGTCOGGGGT 1560 

AGTTGGCAGA AATACAGTCT TCCACAAAGA AAATTGCTAT AAGGAAGATT TGGAAGCTCT 1620 

TCTTCCCAGC ACTATGCTTT CXTPTCTTTGO GATAGAGAAT GTTCC3U3ACA TTCTCGCTTC 1680 

CCTGAAAGAC TGAAGAAAGT GTAGT6CATG GGACCCACX3A AACTGCXTTG GCTCCAGTGA 1740 

AACTTGGGCA CAT6CTCA6G CTACTATAGG TCCAGAAGTC CTTATGTTAA GCCCIG6CAG 1800 

GCAOGIGTTT ATTAAAATTC TGAATTTTGG 6GATTTTCAA AAGATAATAT TTTACATACA 1860 

CTGTATGTTA TAQAACTTCA TGGATCAGAT CTGGGGCAGC AACCTATAAA TCAACACCTT 1920 

AATATGCTX3C AACAAAATGT AGAATATTCA GACAAAATCG ATACATAAAG ACTAAGTAGC 1980 

CCATAAGGGG TCAAAATTTG CTGCCAAATG 06TATGCCAC CAACTTACAA AAACACTTOG 2040 

TTOSCAGAGC TTTTCAGATT GTGGAATQTT GQATAAOGAA TTAXAGAOCT CTAOTAGCTG 2100 

AAAT6CAAGA GCCCAAGAGG AAGTTCAGAT CTTAATATAA ATTCACTTTC ATTTTT6ATA 2160 

OCTGTCCCAT CTGGTCATGT QGTTGGC31CT AGACTGGTGG CAGGGGCITC TAGCTGACTC 2220 

GCACAGGGAT TCTCACAATA GCOGATATCA GAATTTGrGT TGAAGGAACT TGTCTCTTCA 2280 

TCTAATATGA TAGOGGGAAA AGGAGAGGAA ACTACTGCCT TTA GAAAATA TAAGTA AAgT 2340 

6ATTAAAGTG CTCAOGTTAC CTTGACACAT AGTTTTTGAG TCTATGGGTT TA&TTACTTT 2400 

AGATGGCAAO CATGTAACTT ATATTAATAG TAATnOTAA AGTTGGGTGG ATAA6CTATC 2460 

CCTGTTGCOG GTTCATGGAT TACTTCTCTA TAAAAAATAT ATATTTACCA AAAAATTTTG 2520 

TGACATTCCT TCTCCCATCT CTTCCTTGAC ATGCATTGTA AATAGGTTCT TCTTGTTCTG 2580 
AGATTCAATA TTGAATTTCT CCTATGCTAT TGACAATAAA ATATTATTGA ACTACC 

Seq ID NO: 36 Protein sequence: 
Protein Accession #: NP_002630.1 

1 11 21 31 41 51 

| | | | | | 

MDALQLAKSA FAVDLFKQLC EKEPL6NVLF SPICLSTSLS lAQVGAKGDT ANBIGQVLBP 60 

QIVKDIPFGF QTVTSDVNKL SSPYSLKLIK RLYVDKSLNL STEFISSTKR PYAKELETVD 120 

FKDKLEBTKG QIKNSIKDLT JXSSFESILM) NSVNDQTKIL WNAAYFVGK NMKKFPESET 180 

KECPFRUnCT DTKFVQMKHM EATFOIGHn) SXNCKIIELP FQNXELSMFI LLPXDVEDES 240 

TGLEKXEXQL HSBSLSQHTN PST^DlNAICVK LSIPKFKVBK MIDPKACLEN LGLKBIFSED 300 

TSDF8GMSET XGVALSHVIB KVCLBITEDG (SSIEVFGAS ZLQHKDEUIA DHPFZYZIRH 360 
NKTianZFFG KFC8P 



Seq ID NO: 37 DNA sequence 

Nucleic Acid Accession #: NM_0168583 

Coding sequence: 72-842 

1 11 21 31 41 51 

| | | | | | 

GGAGTGGGG6 AGAGAGAC6A GACCAOGACA GCTGCTGAGA CCTCTAAGAA 6TCCAGATAC 60 

TAAGA6CAAA GATGTTTCAA ACTGGGGGCC TCATTGTCTT CTAOOGGCTG TTAfiCCCAGA 120 

CCATGGCCCA GrTTGGAGGC CTCCCOGTOC CCCTGGACCA GACCCTSCCC TTGAATGTGA 180 

ATCCAGCCCT GCCCTTGAGT CCCACAGGTC TTGCAGQAAG CTTGACAAAT GGCCTCAGCA 240 

ATGGCCTGCT GTCTGGGOGC CTGTTGGGCA TTCTGGAAAA CCTTCGQCTC CTGGACATGC 300 

TGAAGCCTGG A6GAGGTACT TCTGGTGGCC TCCXrTGGGGG ACT6CTTGGA AAAGTGA06T 360 

CAGTGATTCC TGGCCTGAAC AACATCATT6 ACATAAAGGT CACTGAOXC CAGCTOCTGG 420 

AACTTCGCCT TGTGCAGAGC CC7X3ATGGCC ACOGTCTCTA TGTCACCATC CCTCTCGGCA 480 

TAAAGCTCCA AGTGAATAOQ CCCCTGGTOG 6TGCAAGTCT GTTGAGGCTG GCTGTGAAGC 540 

TGGACATCAC TGCAGAAATC TTAGCTGTGA GAGATAAGCA GGAOAOGATC CACCTG G TOC 600 

TTGGTGACTG CACCCATTCC CCTGGAAGCC TGCAAATTTC TCTGCTTGAT GGACTTGGCC 660 

CCCTCCCCAT TCAAGGTCTT CTQGACAGCC TCACAGGGAT CTTGAATAAA GTCCTGCXTO 720 

AGTTX5GTTCA GGGCAAOG7G TGCCCTCTGG TCAATGAGGT TCTCAGAGGC TTOGACATCA 780 

CCCTGGTGCA TGACATTGTT AACATGCTGA TCCAOGGACT ACAGTTTGTC ATCAAGGTCT 840 

AAGCCTTCCA GGAAGGOGCT GOCCTCTGCT GAGCTGCTTC CCAGTGCTCA CAGATGGCTG 900 

GCCXATGTGC TGGAAGATGA CACAGTTGCC TrCTCTCXXSA GGAACCTGCC CCCTCTCCTT 960 

TCCOIOCAGQ OGTGTGTAAC ATOOCATGTG OCTCACCTAA TAAAATGGCT CTTCZTCT6C 1020 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 

Seq ID NO: 38 Protein sequence: 
Protein Accession #: NP_057667 

1 11 21 31 41 51 

| | | | | | 

MFQTGGLTVF YGLLAQTMAO FGGLPVFUXi TLPUJVNPAL PbSPTGLAGS LIHALSHQUi 60 

SGGIiU}ILEH LPUSILKP6 CCTS6GLLGG IiLSKVTSVIP GUUIIOIKV TDFQLLBLGL 120 

VQSFDGaRLy VTIPLGIKLQ VDTPLVGASL LRIAVXISIT ABIIAV8DKQ ERIBLVbQ)C 180 

TBSKSLQIS UiDGUaPtaPI OGLUJSLTGI UOWLPSLVQ SIVCPLVIIBV LBGLDIILVH 240 
OmOOiIHGL QFVIKV 









Seq ID NO: 39 DNA sequence 

Nucleic Acid Accession #: NM_004363.1 

Coding sequence: 115-2223 



1 11 21 31 41 51 

| | | | | | 

CTCAGG6CAG AGGGAGGAAG GACAGCAGAC CAGACAGTCA CAGCAGCCTT GACAAAAOGT 60 

TCCTGGAACT CAAGCTCTTC TCCACAGAGG AGGACAGAGC AGACAGCAGA GACCATGGAG 120 

TCTCCCTCGG CCCCTCCCCA CAGATGGTGC ATOOCCTGGC AGAGGCTCCT GCTCACAGOC 180 

TCACTTCTAA CCTTCTGGAA CCOGCCCACC ACTGCCAA6C TCACTATTGA fcrcanasccG 240 

TTCAATGT06 CAGAGGGGAA GGAGGTGCTT CTACTTGTCC ACAATCTGCC CCAGCATCTT 300 

TTTG6CTACA GCT6GTACAA AGGTGAAAGA GTGGATGGCA ACOQTCAAAT TATAGGATAT 360 

GTAATAGGAA CTCAACAAGC TACCCCAGGG CCCGCATACA GTGGTOSAGA GATAATATAC 420 

CCCAATGCAT CCCTGCTGAT CCAGAACATC ATCCAGAATG ACACAGGATT CTACACOCTA 480 

CAOGTCATAA AGTCAGATCT TGTGAATGAA GAAGCAACIG GCCAGTTOT3 GGTATACC06 540 

GAGCTGCCCA AGOOCTCCAT CTCXIAGCAAC AACTCCAAAC CCGTGGAGGA CAAOGATQCT 600 

GTQGCCTTCA CCTGTGAACC TGAGACTCA6 6A0GCAACCT ACCTGTGGTG GGTAAACAAT 660 

CAGAGCCTCC OGGTCAGTCC CAGGCTGCAQ CTGTCCAATO GCAACAGGAC CCTCACTCTA 720 

TTCAATGTCA CAAGAAATGA CACAGCAAGC TACAAATGTO AAACCCAGAA CCCAGTGAGT 780 

GCCAGGCX3CA 6TGATTCAGT CATCCTGAAT GTCCTCTATG GCCCGGATGC CCCCACCATT 840 

TOCCCTCTAA ACACATCTIA C3U3ATCA60G GAAAATCTGA ACCTCrcCTG CCAOOCAGCC 900 

TCTAACCCAC CIQGACAQTA CTCTTGGTTT GTCAATGGGA CTTTCCAGCA ATCCACCCAA 960 

6A6CTCTTTA TOCCCAACAT CACTGTGAAT AATAGTGGAT CCTATAOGTG CCAAGCCCAT 1020 

AACTCAGRCA CTGGCCTCAA TAGGACCACA GTCA06ACQA TCACAGTCTA T6CAGAGCCA 1080 

CCCAAACCCT TCATCACCAG CAACAACTGC AACCC0GT6G AGGATGAGGA TGCTGTAGCC 1140 

TTAACCrGTO AAOCTQAOAT TCAOAACACA AOCTACCTGT GGTGG6TAAA TAATCAGAGC 1200 

CTCC066TCA GTOCCAOGCT GCAGCTGTCC AAT6ACAACA 6GACCCTCAC TCTACTCAGT 1260 

GTCACAAGQA ATGATGTAG6 ACCCTATGAG TGTGGAATCC AGAACGAATT AAGTGTTGAC 1320 

CACAGOGACC CAGTCATCCT GAATGTCCTC TATGGCCCAG ACGACXXXaiC CATTTCCCCC 1380 

TCATACACCT ATTACOGTCC AG6GGTGAAC CTCAGCCTCT CCTGCCATGC AGCCTCTAAC 1440 

GCACCTGCAC AGTATTCTT8 6CTGATTQAT GOGAACATGC A6CAACACAC ACAAGAGCTC 1500 

TTTATCTCCA ACATCACT6A GAAGAACAGC GGAC7CTATA CCT6CCA6GC CAATAACTCA 1560 

GCCAGTGGCC ACAOCAGGAC TACAGTCAAG ACAATCACAG TCTCTGCGGA GCTGCCCAAG 1620 

CCCTCCATCT CCAGCAACAA CTCCAAACCC GTGGAGGACA AGGATGCTGT GQCCTTCACC 1680 

TOTGAACCTG A06CTCAGAA CACAACCTAC CTGTGGTGGG TAAATGGTCA GAGCCTCCCA 1740 

GTCAGTCCCA G6CIGCAGCT GTCCAATQOC AACAGGACOC TCACTCTATT CAAT6TCACA 1800 

AGAAAT6ACG CAAGA60CTA T6TATGTGGA ATCCAGAACT CAGTGA6TGC AAACCGCAGT 1860 

GACCCAGTCA CCCTGGATGT CCTCTATGGG CCGGACAOCC CCATCATTTC OCCCCCAGAC 1920 

TOGTCTTACC TTTOGGGAGC GAACCTCAAC CTCTCCTGCC ACTOGGCCTC TAACCCATCC 1980 

CCGCAGTATT CTTGGC3GTAT CAATGGGATA CGGCAGCAAC ACACACAAGT TCTCTTTATC 2040 

GCCAAAATCA OGCCAAATAA TAAOGGOACC TATGCCTGTT TT6TCTCTAA CTTG6CTACT 2100 

6GCG6CAATA ATTCCATAOT CAAGAOCATC ACAGTCTCTG CATCTGOAAC TTCTCCTOOT 2160 

CTCTCAGCTQ CSGGCCACTGT CGGCATCATG ATTGGAGTGC T6GTTG6GGT TGCTCTQATA 2220 

TAGCAGCCCT GOTGTAGTTT CTTCATTTCA GGAAGACTGA CAGTTGTTTT GCTTCTTCCT 2280 

TAAAGCATTT GCAACAGCTA CAGTCTAAAA TTGCTTCTTT ACCAAGGATA TTTACAGAAA 2340 

AOACTCTGftC CAGAGATCGA GACCATCCTA GCCAACAT08 TGAAACOOCA TCTCTACTAA 2400 

AAATACAAAA ATGAGCTGG6 CTTGGTGGOG 06CA0CTGTA GTCOCAGTTA CT066GAGGC 2460 

TGAGGCAGGA GAAT06CTTG AACCOQGGAO GTGGAGATTG CAGTGAGCCC AGATCGCACC 2520 

ACT6CACTCC AGTCTGGCAA CAGAGCAAGA CTCCATCTCA AAAAGAAAAG AAAAGAA6AC 2580 

TCTQACCTOT ACTCTTGAAT ACAAGTTTCT GATACC3VCT6 CACTGTCTGA QAATTTCCAA 2640 

AACTTTAATG AACTAACTGA CAGCTTCATG AAACTOTCCA 0C3VAGATCAA 6CA6AGAAAA 2700 

TAATTAATTT CATG66ACTA AAT6AACTAA TGAGGATTOC T6ATTCTTTA AATGTCTTGT 2760 

TTCCCAGATT TCAGGAAACT [illegible] TAAGCTATCC ACTCTTACAG CAATTTGATA 2820 

AAATATACTT TTGTGAACAA AAATTGAGAC ATTTACATTT TCTOCCTATQ TGGTCOCTOC 2880 

AGACrTGGGA AACTATTCAT GAATATTTAT ATTGTATGGT AATATAGTTA TTGCACAAGT 2940 
TCAATAAAAA TCTGCTCm 6TATAACAGA AAAA 

Seq ID NO: 40 Protein sequence: 
Protein Accession #: NP_004354.1 

1 11 21 31 41 51 

| | | | | | 

MB8PSAPPHR WCIPWQRLUi TASLLTFWNP PTTAKLTIES TPFNVAEGKE VI*LLVHNILPQ 60 

ELFGYSWYKG ERVDOmQII GYVIOTQQAT PGPAYSGRHI lYPMASIiLIQ NIIQNirrGFY 120 

TIHVIKSDLV KEBATGQFRV ypELPKPSZS SNHSKPVEDR DAVAFTCEPE TQnATYLMHV 180 

NNQSLFVSPR LQLSNGNRTL TLFNVTRNDT ASYRCSTQOT VSARRSDSVZ LNVLYGPHAP 240 

TISPLHTSYR [illegible] AASHPPAQYS WFVNGTPQQS TQELFZPNZT VNNSGSyTOQ 300 

AHNSDTGLNE TTVTTITVYA BPPKPFITSN NSKPVEDEDA VALTCEPEIQ NTTYLWHVNN 360 

QSLPVSPRLQ 1*SNDNRTLTL LSVTRMDV6P YEOGIQNELS vnHSDPVlLM VLYGPDDPTI 420 

SPSYTYYRPG VNLSLSCHAA SKPPAOYSWL ZDGinQQBTQ ELFISNITEK HSGLYTGQAN 480 

HSASGHSHT7 VKTITVSAEL [illegible] XFVBDKDAVA FTCEPEAQKT TYLHHVNGQS 540 

LPVSPRLQLS NGNSTI.TLFN VTRNDARAYV osiossysAw RSDFVTlfDVL Y6POTPZZSP 600 

PDSSYLSGAN [illegible] PSPQYSHRZH GZPQOHTQfVL FZAKZTPBNtf GTYACFVSHIi 660 
ATGRNNSIVK SXTVSASGTS PGLSAGATVG ZMIGVLVGVA LZ 









Seq ID NO: 41 DNA sequence 

Nucleic Acid Accession #: NM_006952.1 

Coding sequence: 11-793 

1 11 21 31 41 51 

| | | | | | 

AATCCCGACA ATGGOSAAAG ACAACTCAAC TGTTOGTTGC TTCCW3GGCC TGCTGATTTT 60 

TGGAAATGTG ATTATTGGTT GTTGCGGCAT TGCCCTGACT GCXKSAGTGCA TCTTCTTTGT 120 

ATC76ACCAA CACAGCXrTCT ACCCACTSCT TGAAGCCACC GACAAGQAZG ACATCTAT6G 180 

GCSCTOCCroO ATOQGCATAT TTGTGGGCAT CTGCCTCTTC TOC CTGTCr G TTCTAOGCAT 240 

TGTAS6CATC AIGAAGTCCA GCAGGAAAAT TCTTCTG6G6 TATTTCATTC TGAT6TTTAT 300 

AGTATATGCC TTTGAAGTGG CATCTTGTAT CACAGCAGCA ACACAAOGAG ACTTTTTCAC 360 

ACXXAACCTC TTCCTGAAGC AGATGCTAGA GAGGTACCAA AACAACAGCC CTCCAAACAA 420 

TGATGACCAG TGGAAAAACA ATGGAGTCAC GAAAACCTGG GACAGGCTCA T6CTCCAGGA 480 

CAATTGCTGT GGOGTAAATO GTCCATCAGA CTGGCAAAAA TACACATCTG CCTTCOGGAC 540 

TGAGAATAAT GAT6CTQACT ATCCCTGGCC TOGTCAATGC TGTGTTATGA ACAATCTTAA 600 

AGAACCTCTC AACCTGGAGG CTTGTAAACT AGGCGTGCCT GGTTTTTATC ACAATCAGGG 660 

CTGCTATGAA CTGATCTCTG CTCCAATGAA CCGACACGCC TGGGGGOTTG CCTGGTTTCG 720 

ATTTGCCATT CTCTGCTGGA C m TiX JGG T TCTCCTGGGT ACCATGTTCT ACTGGAGCAG 780 
AATTGAATAT TAAGAA 

Seq ID NO: 42 Protein sequence: 
Protein Accession #: NP_008883.1 

1 11 21 31 41 51 

| | | | | | 

HAKDNSTVRC FQGLIiZFiafV IXGCOGIALT AECIFFV8DQ HSLYPLIiEAT DKDOIYGAAH 60 

ZGZFVGIOiF CLSVUSZVGX HKSSHRILLA YFILMPIVYA FEVASCZTAA TQRDFFTPNL 120 

FLRQMLERYO !WSPPNNDDQ HKHNGVTKTW DRLMLQDHCC GVHGPSDWQK YTSAFRTENN 180 

DADYFWPRQC CVMNNLKEPL HLEACKUSVP GPyBNQGCyB LISGEMKRBA HGVAHFGFAI 240 
LCWTFHVLLG TMFYWSRIEY 



Seq ID NO: 43 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 83-2605 

1 11 21 31 41 51 

| | | | | | 

GCOGGACAGA TCTGCGC3GTA TCCTGGA6CC GGCCCAGTTG TGAACTAGGA GAGCTTTX^ 60 

ACCTCTGTCC CAAGCAAOAG AGATGAATGG A6AGTATAGA GGCAGAGGAT TTGGAG6AGG 120 

AAGATTTCAA AGCTG6AAAA GGGQAAGAOO TOGTGGGAAC TTCTCA66AA AATGGAGAGA 180 

AA6AGAACAC AGACCTQATC TGAGTAAAAC CACA86AAAA GGTACTTCTQ AACAAAOCCC 240 

ACAGTTTTTG CTTTCAACAA AGACCCX31CA GTC3VATGCAG TCAACATTGG ATOSATTCAT 300 

ACCATATAAA GGCTGGAAGC TTTATTTCTC TGAAGTTTAC AGCGATAGCT CTCCTTTGAT 360 

TGAGAAGATT CAAGCATTTG AAAAATTTTT CACAAG6CAT ATTGATTTGT ATGACAAGGA 420 

TGAAAXA6AA AGAAA6GQAA GTATTTTQGT ASATTTTAAA GAACTGACAG AAGOTGGTGA 480 

AGTAACTAAC TTGATAOCAG A7ATA0CAAC TGAACTAAGA GATGGACCTG AGAAAACCTT 540 

GGCTTGCATG GGTTTGGCAA TACATCAGGT GTTAACTAAG GACCTTGAAA GGCATGCAGC 600 

TGAGTTACAA GCCCAGGAAG GATTGTCTAA TGATGGAGAA ACAATGGTAA ATGTGCCACA 660 

TATTCATGCA AGGGTGTACA ACTAT6AGCC TTTGACACAQ CTCAAQAATQ TCAGAGCAAA 720 

TTACTATGGA AAATACATTG CTCZAAOAGQ GACAGIGGTT 08TGTCAGTA ATATAAA6CC 780 

TCTTTOCACC AAOATGGCTT TTCTTTGTGC TGCATGTG6A GAAATTCAGA GCTTTCCrCT 840 

TCCAGATGGA AAATACAGTC TTCCCACAAA GTGTCCTGTG CCTGTGTGTC GAGGCAGGTC 900 

ATTTACTGCT CTCCGCAGCT CTCCTCTCAC AGTTACGXtG GACTGGCAGT CAATCAAAAT 960 

CCAGGAATTG AT6TCTGATG ATCAGAGAGA AGCAGGTOGO ATTCCACGAA CAA7AGAATG 1020 

T6AGCTT6TT. CATQATCTTQ TGOAIAOCTO TGT O COO G GA QACACAeTGA CTATTACTGG 1080 

AATTGTCAAA GTCTCAAATG OOGAAQAAGG TTCTGSAAAT AAGAATGACA AGTGTATGTT 1140 

CCPrrTGTAT ATTGAAOCAA ATTCTATTAG TAATAGCAAA G6ACAGAAAA CAAAGAGTTC 1200 

TGAG6ATGGG T6TAAGCATG GAATGTTGAT GGAGTTCTCA CTTAAAGACC TTTATGCCAT 1260 

CCAAGAGATT CAA6CTGAA0 AAAACCTGTT TAAACTCATT GTCAACTCGC TTTGCCCTGT 1320 

CATTTTTGGT CAT6AACTT6 TTAAAGCAGG TTTGGCATTA GCACTCT7TG GAGGAAGOCA 1380 

GAAATA06CA GATGACAAAA ACAGAATTGC AATT03GGGA GACCCGCACA TCCTTGTTGT 1440 

TGGAGATCCA GGCCTAOGAA AAAGTC3WVT GCTACAG6CA GCQTGCAATG TTGCCCCAOG 1500 

TGG06TGTAT LT m xnWl' A ACACCAOGAC CACCTCTGGT CTGACGGTAA CTCTTTCAAA 1560 

AGATA6TTCC TCTG6AGATT TTGCTTTGGA AGCTGGTGCC CTGGTACTTG GTGATCAAGG 1620 

T A ' rt T G T G GA ATCGATGAAT TT6ATAAGAT GGG6AATCAA CATCAA6CCT TGTTGGAAGC 1680 

CATGO^CAG CAAA6TATTA GTCTT6CTAA GGCTCGTGTO GTTTGTAGCC TTCCTGCAAG 1740 

AACTTCCATT ATTGCTGCTa CAAATCCAGT TGGAG6ACAT TACAATAAAG CCAAAACAGT 1800 

TTCTQAQAAT TTAAAAATGG GGAGTOCACT ACTATOCAGA TTTGATTTGG TCTTTATCCT 1860 

GTTAOVTACT CCAAAT6AGC ATC31TGATCA CTTACTCTCT GAACATGTGA TT6CAATAAG 1920 

AGCT66AAAG CAGA6AACCA TTAGCAGTGC GACAGTAGCT OGTATGAATA GTCAA6ATTC 1980 

AAATACTTCC GTACTTGAAG XAGITTCTQA GAAOCCATTA TCAGAAAGAC TAAAGGTGGT 2040

TCCTGQAGAA ACAATAGATC CCATTCCCCA CCAOCTATTG AGAAAGTACA TTGGCTATGC 2100 

TCGQCAGTAT GTGTACCCAA GGCTATCCAC A13UUOTGCT aSAGTTCTTC AAGATTTTTA 2160 

CCTT6AGCTC 06GAAACAGA GCCAGAGGTT AAATAGCTCA CCAATCACTA CCAGGCAQCT 2220 

GGAATCTTTG ATTOGTCTGA CAGAG6CA0G AGCAAGGTTG GAATT6AGAG AGGAAGCAAC 2280 

CAAAGAAGAC GCTGAGGATA TAGTGGAAAT TA7GAAATAT AGCATSCTAG GAACTXACTC 2340 

TQAXGAATTT GGGAAOCTAO ATTT7GAGCG ATCCC3U3CAT GGTTCTGGAA TGAGCAACAO 2400 

GTCAACAGOO AAAAGATTTA TTTCTGCTCT CAACAAOGTT GCFGAAAGAA CTTATAAZAA 2460 

TATATTTCAA TTTCATCAAC TTCGGCAGAT TGCCAAAGAA CTAAACATTC AGGTTGCTGA 2520 

TTTTGftAAAT TTTATTOGAT CACTAAATGA CCAGGGTTAC CTCTPGAAAA AAGGCCCAAA 2580 

AGTTTACCAG CTTCAAACTA TGTAAAAGGA CTTCAOCAAG TTAGGGOCTC CTGGGTTTAT 2640 

TQGA6ATTAA AGCCATCTCA GTGAA6AXAT GGGTGCAOGC ACAQACAGAC AGACACACAC 2700 

AOUAChCAC ACACACACAC ACA£SU»CAC ACACACAGTC AAAZACTGTT CTCTGAAAAA 2760 

T6ATGT00CA AAAGTATTAT AATAGGAAAA AAGCATTAAA TA7AAIAAAC TAATTTAAGA 2820

AGfTGATAAAG TCTOCAGftTG CA6TA6CTCA CACTGTAATC AOSTGACTC A0GAG6CTGA 2880

GGIGAGAGGA TTCCTTGAGG CCACGGTT06 AGACCAACCT T0G6CAACAT A6CAAGACCC 2940 

CATTTCTTAA AAAAAAAAAA AAAAAATTTA AACTTAGCTQ GGTATGGTGG CACATGCCIA 3000 

TAGTCTCAGC TACTTGTGAQ OCTGAGGCAO GAGGATTCTT TGAGCCCAGG AGTTTGAGGT 3060 

TACAGK3AGC CACAATCACA OCAATCACTG CACTCCA6CC TGGGCAATAA AGTAACTCTT 3120 

GACTCAAAAA AATAAAAAAA ATT6TAGTGG TA6CCATGT6 TTAATTGTTA AATAAATTCT 3180 

CC3UU«3GGCr AAAACTAAAT TACTTATAAA TTTTTTATAG TTGTATTTTT GAOCPGCCTT 3240 

TTATATGTAT GAATATTTCA TAGTTTTGCA TATCAQATGT AGGCATACAG ACAAATACAT 3300 

AAACCAATGA ATATATTACA TATTCTGTGT TCCAATAAAA CTTTATTTAT GGACACTAAA 3360 

ATTtXSAATTP CATAAAATTT TCOCATGTCA AGAATACAAA ATACTTGAGT TTTGTTTTTA 3420 

GCTATTXAAT AATAG6TCTC ATTTATTCCA CAGGCTGTA6 TTTGTAGTCT TGCTTGAAAC 3480 

AATAGAAACA QACTGATTAA GCA6GAGAAG TTTTTTGAAA GAATTTTGTT TG6CTCA0GG 3540 

AATTATTAGA AGGCAGGTGA ACX^GGAGGG TAAGCTTCCA GCAGCAATTT GTAAAAOCAT 3600 

GCCTTAGAAT TGGACTAAGQ AAGAAGCTGC TGACACTCCA CTGCCACACA GGGCACTGGA 3660 

AGAAAQTGCT GCTGCCTCCC TGCCX:CAOCT TTGCCACTTC TGCA6CA66A ATA06TAGAA 3720 

GAATGCCCCC ACC06CACCG GAACAGCAAC AAAAG6ATTC TGCATGAGAT 60CTC0CTAA 3780 

ATTGCTGAAT TCAAAAAAGA AGTTGCATAC AAAGACATCT 6ATTGAAAAA 6GGTAT6TTA 3840

TATGCCCCTT TCATAGGCTG CTAGGGAGTT TTCCTGGTTC TACTTTCAfiG TGGTGGGATC 3900 

AATAAGAOai GAATTTCTCA TATGTTGTGA GAG6ATTCAA ATGTTACAGG GTTGCCAGOC 3960

AAACTATCAA TCATGIATAA ATCCAACAAA CACTTTGTAA CATACAAGAA CTC3W3GAAAT 4020 

GTGAACCATT GTTGGAGAAT CTACTAAAAT ACGGCTTGCC GCAAAOGAAG ATGAATG6AA 4080 

AATGTAAATA AAAAGAACTG GCAGTGTATA TCAGATGTTT AACTATAGGA CCAGAACTAA 4140 

GATOTGGAGA CTATTGOCAT AGAOCACAAT GTAAATTTTT AAGTGAGGAA GGAAAAATCA 4200

CGAATCAAAA GGGGCCAG6T GCAGTGGCTC ACATCTATAA TCCCAGAGCT TTGGQAGTTC 4260 

6AGGCAGGA6 GATCACTTGA A6CCAGTTTT GAGACCAGCC TATGCAACAC ATTGAGACCC 4320 

TATCTCTACA AAAAATAGAT TAGCTGGGCA CGGTGGTGCA TGCCTATTGT CCTACCTACT 4380 

GTGGAGGCTO AAGTAGGAAA TCACTTGA6C COGAGAGTTT GAGGTTACAG T6AGCTATGA 4440 
TTAXAOCACr GCACTCCAGC CTGGGCAASA GAGCAAGAOC TTGTCTCTT 

Seq ID NO: 44 Protein sequence:
Protein Accession #: CAB55276.2

1 11 21 31 41 51

I I I I I I

KNGEYRGRGF GRGRFQSWXR GRGGOfFSGK HRERE&RFDL SKTTGKR7SB QTPQFLLSTR 60 

TPQSMQSTLD RFIPYKGWKL YFSEVYSDSS PLIEKIQAPB KPFTRHIDLY DKDBIERKGS 120 

ILVDFKBLTB GGEVTNLIPD lATELXlDAPE KTliACMGIiAI BQVLTKDLER HAAELQAQEG 180 

LSMDGETMTO VFHZBARVYN YEPIiTQLXNV RAHYVGlOriA LRGTWRVSN IKPLCTKHAF 240 

LCAAOGBIQS FPLPDGKYSL PTKCPVPVCS GRSFTALRSS PLTVrmDWQS IKIQELNSOD 300 

QREAGRIPRT lECELVHDLV DSCVPGDTVT ITGTVKVSMA EBGSRNKNDK CMPLLYIEAN 360 

SISNSKGQKT KSSEOGCKHG MU4EFSLRDL YAIQSIQAEE NLFKLIVKSL CPVIFGHELV 420 

KAGLALALFG GSQKYADDKN RIPHUSFBI LWGDPGLGK SQMLOAAOIV AFRGVYVCGV 480 

TTTTSGLTVT ZiSKDSSSQ)? ALEAGAIiVLG DQOIOSIDBF 0KM6HQHQAL LEAMEQQSIS 540 

LAKAGWCSL PAHTSZZAAA NFVGGBYNKA KTVSEHIiKMG SALLSRFDLV FILLDTeNEB 600 

HDHLLSEHVl AlRAGKQRTl SSATVARMNS QDSNTSVLEV VSEKPLSERL KWPGETIDP 660 

IPHQLLRKYI GYAHQYVYPR LSTEAARVLQ DFYLBLRKQS QRLNSSPITT RQLESLIRLT 720 

BARARLELRS EATKEDAEDI VEIMKYSMUS TYSDEFGNZJ) FERSQHGSGN SNRSTAKRFI 780 
SAUaNVAERT YNNIFQFBQL RQZAXELNIQ VADFBHFIGS UlDQGYUiKK GPKVYQLQTM 



Seq ID NO: 45 DNA sequence 

Nucleic Acid Accession #: NM_005416.1

Coding sequence: 149..658

1 11 21 31 41 51 

I I I I I I 

ACCA6AT0CC AGAGGCTGAA CACCT06AGC TTCTCTGCAC AGCAGATGAT CCCTGAGCAG 60 

CIGAAGACCA GAAAA6CCAC TAAOACrTTC TQCTTAATTC AGGASCTTAG AGGATTCTTC 120

AAAGAGTGTG TCCACGATOC TTTQAAOCAT GAOTTCTTAC CA0CA6AAGC AOACCTTTAC 180 

CCCACCAOCT CAGCTTCAAC AGC3W3CAGGT GAAACAACCX: AGCCAGCCTC CACCTCAGGA 240 

AATATTTGTT CCCACAACCA AGGAGCCATG CCACTCTUU^ GTTCCACAAC CTGQAAACAC 300 

AAAGATTCCA GAGCCAGGCT GTACCAAGGT CXXTGAGCCa GGCTGTAOCA AOGTCCCTGA 360 

GCCAGGCTGT ACCAAGOTCC CT6A6CCAGG TTGTAOCAAG GTCCCTGAGC CAGGCTGTAC 420 

CAAGGTCCXn' GAGCCAGGTT GTACCAAGGT OCCIGAGCCA GGCTACACCA AG6TCCCTGA 480 

ACCAGGCAGC ATCAAGGTCC CTGAOCAAGG CTTCATCAAG TTTCCTGAGC CAGGTGCCAT 540 

CAAAGTTCCT GAGCAAGGAT ACACCAAAGT TCCTCTGCCA GGCTACACAA AOCTACCAGA 600 

GCCATGTCCT TCAAOGGTCA CTCCAGGCCC AGCTCAGCAG AAGACCAAGC AGAAGTAAT? 660 

TGGTGCACAG ACAAGCCCTT GAGAAGCCAA CCAOCAQATG CTGGACACCC TCTTCCCATC 720 

TGTTTCTGTQ TCTTAATTGT CTGTAQACCT TGTAATCAQC ACATTGTCAC CCCAAGCCAT 780 

AGTCTCTCTC TTATTTGTAT CCTAAAAATA CGTACTATAA AGCTTTTGTT CACACACACT 840 

CTGAAQAATC CTGTAAGCCC CTGAATTAAQ CAGAAAGTCT TCATGGCTTT TCTGGTCTTC 900 

GGCT6CTCAG GGTTCATCTO AAGATTGGAA TGAAAAGAAA TGCATGTTTC CTGCTCTTCC 960 
CTCATTAAAT TGCTTTTAAT TCCA 



Seq ID NO: 46 Protein sequence:
Protein Accession #: NP_005407.1

1 11 21 31 41 51

I I I I I I

MSSYQQKQTF TPPPQUQQQQ VKQPSQPPPQ BIFVPTTKEP CBSKVPQPGH TKIPEPGCTK 60 

VPEPGCTKVP BPGCTKVPEP GCTKVPEFGC TKVPEPGCTK VPEPGYTKVP EPGSIECVPDQ 120 

GFZKPPBPGA IKVPEQGYTK VEVPGYTKLP BPCPSTVTPG PAQQXTKQR 

Seq ID NO: 47 DNA sequence 

Nucleic Acid Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I 

GGGTOGTCjTG CAiGGO?rCCC OGG6CTGTGG AXAATTAGAC AOGTTCTTOC CTCATTGCCC 60 

AAGGCTOGTT AGAATT06CC CTAGAGCTGT ATCATGTATT TTCTTTCAAA TTAACTTTGC 120 

TTGCAATTAA GCTTAGGGAA COVGCAACAA AAGCAAACTT GGCCOGAGGT OGTTCACOGC 180 

GAAAATGGAT TAGAGAAACT TCTTCOOOGA TTTAAGGGGA AAGATTCCT6 OBGOCAGOGC 240

TTT6GGGAAA GTGOCCXXSAC 06CA6AG60G AOGACAGGGG AGCAGGAAGC T6CTCA06GT 300 

AGTOGGCGTT GGQG6CA606 GTGGCCTTCC TCATCIGGGC GATGTGGGCT OCTAGAAGAO 360 

TAAGGATAAC ATCCTGGAAA TGACTTCTGT ACGGTTTGAG CCCAACTGCA CACTCATGAC 420 

TTGGA6CTGC CCTOTGGAGT TACAGTTTAC CAAAOlCMT CAT6AACATA ATCTCATTTA 480
CTAAAAACTT TGTGAGAATT TTCTTTTACT AAAATTTTTT CTTATTACAA A 

Seq ID NO: 48 DNA sequence

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51

I I I I I I

TTCCAAATTT TTTTTTTTGT AATAAQAAAA AATTTTAGTA AAAfSAAAATT CTCACAAAGT 60 

TTTTAGTAAA TGAGATTATG TTCATGAATG TGTTTGGTAA ACTGTAACTC CACAGG6CA6 120 

CTCCAAGTCA TGAGTGTGCA GTTGGGCTCA AACCGTACAG AAGTCATTTC CAGGATGTTA 180 

TCCTTACTCT TCTCGGAGCC CACATCGCCC AGATGAGGAA GGCCACCGCT GCCGCCAA06 240 

C0GACTACXX5 TGAGCAGCTT CCTGCTCCCC TGTOGTOGCC TCTGCGGTCX5 GGGCACTTTC 300 

CCCAAAGOGC TGGCGGCAGG AATCTTTCXX: CTTAAAT06G GGAAGAASTT TCTCTAATCC 360 

ATTTPOGCJGG TGAACGACCT OGGGCCAAGT TTGCTTTTGT TGCTGGTTCC CTAAGCTTAA 420 

TTGCAAGCAA AGrTAATTTG AAAGAAAATA CATGATACAG CTCTAGGGOG AATTCTAACG 480 

AGCCTTOGGC AATGAGGGAA GAACGTGTCT AGTTATCCAC AGCC0GGG6A CGCCT6CACA 540 
06A0GCT 

Seq ID NO: 49 DNA sequence

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51

I I I I I I

TCTTTCTTCT GCTGCTCCJTT TGTCTCTCCT GTGCTCTTCT TCTTTCTTTC CCTCGCOGCT 60 

CCTGCOGACC TCTGTTGTCT CTTCTCTGAT GGCGGC3GGGC QGQAGAAGCT GAOOGGTGAG 120 

AC06TAGACC CGAAACCATT GGGTGTCACA AGCCGGTCGC CGGCXTTTTT G66AGAACCC 180 

QACACATQCA GACQU3TTTT OCTGGAAQTG CATGACCAT6 TTATTACIAT G6GC0GCCTC 240 

CCCAACCAAA GTGTTTAAAA CTTTTTAGGO CACCCCCAAA ATTTTTTTTT TTTTTTTTTT 300 

TTCATTTAAA AAACTCTAAT ATTTATATTA AATACAAAGA TACCCAAACC CTTTATQCTT 360 

CTTTCTCTGA TCTGTGTCTT TTTTCTTTGA CAGCATCTCC ATTTTTTTTC TGCTGCTTCA 420 

TCGCTGTAGC CATGG6AATC CGTTTCATTA TTATG6TAGC AATAT6GAGT GCTGTATTCC 480 

TAAAGAAACT GACACAGGAG AATCACTTGA ACTTG6GA00 CAGAGTTTGC A&TGAGCCGA 540 

GATTGAACCA GTGCACTCCA GCCTTGGCAG CGGAGCAAGA TTCTGTCACA GTTCCTGAAO 600

TGCTGGTATC GTCCTGCAGC CCCATCCTOG GTTCCATTGC GCTGCCAGGC AGGGTGCTGG 660 

GAOGTGGGGA GAOCTGGTCT ATATATCCGG GTGAAGCTCA GCTGTGGCAC ACCTTGGATG 720 

CCGGGTCrCT CCTG6CCC0G GG6ACCTAGT ATTTTTGCCA OSAGTGTACA CCAAACAAAG 780 

GAGACAGCAT CATTTATGAG CCTQCAGGAT CCACCCTACT GCZGTATCCA 6TTTCCATT6 840 
ACTO 

Seq ID NO: 50 DNA sequence 
Nucleic Acid Accession #: L05187
Coding sequence: 1991..2260

1 11 21 31 41 51

I I I I I I

CT6CAGGGAG GCAG6TAGAA AAGGCTTTTG OGTTTTCAGG T6GQ6GGCAQ TCTAGCCTGA 60 

TCAGAAAGGA 6GAAAAGGCC AGOGCAGATO TCTG6GTGGA GT6AAGGGAA AAAGTGATCC 120 

CAGAAGAAGQ ATTAGCCCCT GAAA6TCCCT GAAQTAQ6AG AAGGGTAAAG GTGTGGTTGG 180 

TGAAGQAAAG CAG6TTTTCC CAOATTAGCA ACCAGTCAG6 GGGAGGAAGG TGAGAGTGGG 240 

AGAGTCATAA GTAAATTATT CTGAATGTGT GTAGTTTAAT GGAATTGGGA AAAAQATGGQ 300 

GGAAATOGAT GGAAGGTCTT G6ACTCTGAG ACAAGGGGTC TATAATCA6T CCATTTCATT 360 

ATTTCTAGCT TCCACCTTCA CCAAGGCAGA CAAGGAGGGC 0CACCTCA8C TCCTCTSCTC 420 

CCCCTOCCTT TCCCACCTAT TCATGTGTGC AA6AGT6CCC TGTCCCACAG AACA06GGGA 480 

ACAACCATCT CAATGACAAG GACAGCAGGT GGCAAGGCTC AACAGGACTC AGATGTCOCC 540 

CCAGGGTTAA CTCATGAAAC CCTCCATGAA GCCTGCTGCT CACCCCTCCC TCAAGGCAAG 600 

CCCTGCACCr GGGTCTGAGG ATGAGGGT6G CA6T6AAAAT TAGGCCAGTG ACATCATTTT 660 

CA6CCAGCTA GIGCCAAAAA ATATCAGGTO GTOTTCATCA AATAAGCOGA GCCAACCOOT 720 

QATGAGGATG GTAGTGTOAO TCATQT6TGA CAGGT6AGGA ATGAAAACAG AGTGC006A6 780

AOCTTCTATT TCCTTGAGGC AGGGCTCATT CA7C7TATAA AAGCCAGCTG GCCATTGCCT 840 

TCACACX3UVA CCCAAGGGAC CACACAGCCC ATTCTGCTCC GTATAOCAGG TAAGTCTCTG 900 

ATTGCAACAA AC7GGCAATT CTAGTGTACT TTTTCATTAT TAGAAATTAG CTAAAGGCAA 960 

ATATGTGTAA GCAGGTTAAT CCAGGGTTTC AATGQGAGAT A6AGAATAOT GGAAXATCTT 1020 

TATTTTAAOT TAAATTACAG TCTQGATTTG AAAGGACCTT AGAGKIGGTT AGGGCTCCX3V 1080 

CCTCAGTAGA TAGTCATTGA ACTQG6A3TC CTGGAGAAGA TTGTTCAAAT GCCCATQGGA 1140 

AGTTCATAGC AGAACTAGAA CTCAGCCCAG AGCACTCTCA GTAACACTGC AATTTCCCCC 1200 

TGACAAGATA TTTATAGAAA TTTTAATTTA TTAGATGGAT CTCTACTGAO CATTTATTCC 1260 

ATTTAA6GCA GTATGCTAGG CACTTTGGAC AAATCftATGC OCTAACGIAC TTACTTAACA 1320

AACATAAAAC CTA6CAG6AA G6TAATACAT ATAXAXAAAT AA2CTGAAATG CAAA8TAGAT 1380 

AGTAATTSGC ATGAOGGAGA T6GGCA6AGA AGGGCTGTGC ACTTTTGGGA GACTTXTCTCA 1440 

AGGAGAOCTC TAGGGTGTCA AGTGATGTGA GCTATGATGG AGGG6TATTT GGACAAGCS^ 1500 

AGAT6GGAAG AAAAGCATTT GGAAG6GACT GTGTAAGCAC AGACCAGAAG CAAAACCATA 1560 

GAGGCTTAGA TGAATATAAA GCCATCCTAT AAGTCACAGO CTTTCTACAT GGTACTAGGA 1620 

GA6GAAAGTG GTCTGATGCC AT7TTCCAAA AQACCTAATA TGOGGACCTC ATGTCCCTCA 1680 

GAAGCCAGCT TTAGTAGC5GC ATTTTTCCAG AACAGATATA AGGTGCCTTG GGTAGGAAGG 1740 

GAGCCAAGAA GAGAACTCCA ATAAAATGSl GCAGAAGAAA 7TGCCTTTTA 6CTCCTCCTC 1800 

TTCAAAGGGC CTGAAAATTA TCCAAGCTTA TTTCATTTTT AAATGTAATG GGGGAGCEAA 1860

GGGACATGAA AG6CTTTCTC TTCTAAA6G6 TCCTGAAATA AAATCTGTTT G6CATTGAAT 1920

TTGTATCCA.T CTTTCTTTAA TTGAATCftCT GTGTCAGCTT TCTGTCTCTA 6AAAAAAACA 1980

CATTTGAAGC ATGAATTCTC ASCAGCAGAA GCAGCCTTGC ACCCCACrCC CTCAGCCTCA 2040

GGAGCAGCAG GTGAAACAAC CTT6CCAGCC TCCACCCXAG GAACCATGCA TCCCCAAAAC 2100

CAAG6AGCCC TGCCAACCCft A6GTGCCTGA 6CCCTGCCAC CCCAAAGT6C CTGAGCOCTG 2160

CCAGC0CAA6 ATTCCAGAGC CCTGCX»GCC OUVGGTGOCT GAG00CT6CC CTTCAAOGGT 2220

CACTCCAGCA CCAGCCCAGC A6AAGACCAA GCAGAAGTAA TOTOGTOCAC A6CCATGC0C 2280

TTGAGGAGCT GGCCACTGGA TACTGAACAC CCTACTOCAT TCTGCTTATQ AATCCCATTT 2340

GCCrATTGAC CCTGCAGTTA GCATCCTGTC AOCCTGAATC ATAATOGCTC CTTTGCACCT 2400

CTAAAAAGAT GTCCCTTACC CTCATTCTG6 A6GCT0CTGA GCCTCTGOGT AAGGCT6AAC 2460

GTCTCftCTGA CTGAGCTAjST CllCrmriG CT0GGGT6CA TTTGAGGATG GATTTGOG^ 2520
AGGTCAAGTG ACCATGCCTA 6

Seq ID NO: 51 Protein sequence:
Protein Accession #: AAC2683B

1 11 21 31 41 51

I I I I I I

MNSQQOKQPC TPPPQPQQQO VKQPOQPPPQ EPCIFKTKEP CQFKVPEPCB PKVPBPOQPK 60
XPBPOQPKVP EPCPSTVTPA PAQQKTKQK

Seq ID NO: 52 DNA sequence

Nucleic Acid Accession #: NM_002638.1

Coding sequence: 120-473

1 11 21 31 41 51

I I I I I I

CAATACASCT AAG6AATTAT CCCTTGTAAA [illegible] CCGCCCTGGA GCCAGGCCAA 60

GCTGGACTGC ATAAAGATTG GTATGGCCTT AGCTCTTAGC CAAACACCTT CCTGACACCR 120

TGAGG6CCAG CAGCTTCTTG ATCGTGGTGG TGTTCCTCAT OGCTG0GAO8 CTGGTTCTAG 180

AG6CASCTGT CAOGGGAGTT CCTGTTAAAG GTCAAGACAC TGTCAAAGGC CGTGTTCCAT 240

TCAATGGACA AQATCCOGTT AAAOQACAAG TTTGAGTTAA AGGTCAAGAT AAAGTCAAAG 300

OGCMOAGCC AGTCAAAGGT CCAGTCTCCA CTAAGOCTGG CTCCTQCCCC ATTATCTTGA 360

TCCGGTGCGC CATGTTGAAT CCCCCTAACC GCTGCTT6AA AGATACTGAC TGCCCAGGAA 420

TCAAGAAGTG CTGTGAAGGC TCTTGOGGGA TGGCCTGTTT OGTTCCCCAO TGAAGGGAGC 480

OGGTCCTTGC TGCACCTGTG ccx3rrccccAS A6CTACAGGC CCCATCTGGT CCTAAGTCCC 540

T6CTGCCCTT COCCTTOOCA CACXGTCCAT TCTTOCTCCC ATTOUSGATQ CCCA06GCT0 600
GA8CTG0CTC TCTCATCCAC TTTCCAATAA A

Seq ID NO: 53 Protein sequence:
Protein Accession #: NP_002629.1

1 11 21 31 41 51

I I I I I I

MRASSPLIW VPLIAGTLVL EAAVTGVPVK GQDTVKGRVP FBGQDPVHGQ VSVKGQDKVK 60
AQEPVKGPVS TKPGSCPIZL IRCAMIiNPFN RCLXDTDCPG IKKCCB3S0Q MACFVPQ

Seq ID NO: 54 DNA sequence
Nucleic Acid Accession #: NM_019616
Coding sequence: 75-584

1 11 21 31 41 51

I I I I I I

GGCAOQAGCC ACGATTCAQT CCCCIGGACT GTAGATAAAG ACCCTTTCTT GCCAGGTGCT 60

GAGACAACCA CACTATGAGA GGCACTCCAG GASA06CTGA TGGTGGAGGA AGGGCOGTCT 120

ATCAATCAAT GTGTAAACCT ATTACTGGGA CTATTAATGA TTTGAATCAG CAAGTGTGGA 180

GCCTTCAGGQ TCAGAAOCTT GTGGCAlGTTC CAOQAAGTGA CAGTGTQACC OCAGTCACTO 240

TT6CTGTTAT CACAT6CAAG TATCCA0A06 CTCTTGA8CA AGGCA6A6GG GATCCCATTT 300

ATTTGGGAAT CCAGAATCCA GAAATGTGTT TGTATTGTGA GAAG6TTGGA GAACAGCCCA 360

CATTGCAGCT AAAAGAGCAG AAGATCATGG ATCTGTATGG CCAACCCSAG CCCGTGAAAC 420

CCTTCCTTTT CTACOGTGCC AA6ACTGGTA GGACCTCCAC CCTTGAGTCT GTGGCCTTOC 480

OGGACTGGTT CATT6CCTCC TCCAAQAGRG AOCAGCCCAT CATTCTGACT TCAGAACTTG 540

GGAAGTCATA CAACACTGCC TTTGAATTAA ATATAAAT6A CTGAACTCAG CCTAGAGGTG 600

GCAGCTTGOT CTTTGTCTTA AAGTTTCTGG TTCCCAAT6T GTTTTOGTCT ACATTTTCTT 660

AQTQTCATTT TCACX3CTGGT GCTGAGACAG GGGCAAGGCT GCTGTTATCA TCTCATTTTA 720

7AATGAA6AA GAAGCAATTA CTTCATAGCA ACTGAAGAAC AG6AT6T6GC CTCAGAAGCA 780

GGAGASCTGQ GTGGTATAAG OCTGTCCTCT CAA0CTGGTG CTGTGTA06C CACAAGGCAT 840

CTGCATGAGT GACTTTAAGA CTCAAAGACC AAACACZGAO Cm'C r i'Ci' A GGGGTGGGTA 900

TQAAGATGCT TCAGAGCTCA TG0G06TTAC CCAOGATGGC ATGACTAGCA CAGAGCTGAT 960

CTCTGTTTCT [illegible] ATTCCCTCTT GGGATGATAT CATCCAGTCT TTATATGTTG 1020

CCAATATACC [illegible] TAATAGAACC TTCTTA6CAT TAA6ACCTTG TAAACAAAAA 1080

TAATTCTTGT [illegible] ATCATTmC TCCTAATTGT AATGTGIAAT CTTAAAOTTA 1140
AATAAACTTT [illegible] ATAATAAAAA AAAAAAAAAA AAA

Seq ID NO: 55 Protein sequence:
Protein Accession #: NP_062564

1 11 21 31 41 51

I I I I I I

KRGTPGDADG GGRAVYQSMC KPITGTINDL NQQVHTbQGQ NLVAVPRSDS VTPVTVAVIT 60
CKYFEALEQG HGDPXVLGIQ NFEMCLYCEK VGBQPTLQIiK BQKXMDhYGQ PEFVKPFZiFY 120
BAKIGRTSTL SSVAFPDNFZ A8SKRDQPZI LTSEbGKSYH TAFEUTZND

Seq ID NO: 56 DNA sequence
Nucleic Acid Accession #: NM_003125
Coding sequence: 65-334

1 11 21 31 41 51

I I I I I I

AGCAGTTCTA AGGGACCATA CAGAGTATTC CTCTCTTCAC ACCA3GACCA GCCACTGTTG 60 

CAGCATGAGT TCCCAGCAGC AGRAGCAGCC CTGCATCCCA CCCCCTCAGC TTCAGCAGCA 120 

GCAGGTGAAA CAGCCTTGOC AGCCTCCAOC TCACGAAOCA TCCATCOCCA AAACCftASCSA 180 

GCCCTGCCAC CCCAAGGT6C CTGAGOCCTG CCACCOCAAA GTGOCTGAGC CCTGCCftGCC 240

CAAGCTTCCA GAGCCATGCC ACCCCAAGGT GCCTGAGCCC TGCCCTTCAA TAGTCACTCC 300

AGCACCAGCC CAGCA6AAGA CCAAGCAGAA GTAATOTGGT CCACAGCCAT GCCCTTGAGG 360 

AGCOGGCCAC CAGATGCTGA ATCCCCTATC CCATTCTGTG TAT6AGTCCC ATTTGCCTTG 420 

CAATTAGCAT TCTGTCTGCC CCAAAAAAGA ATGTGCTAT6 AAGCTTTOrT TCCT^AOIC 480 

TCTGAGTCTC TGAATGAAOC TGAAGGTCTT AGTACCAGAG CTASTTTTCA GCTGCTGAGA 540 

ATTCATCTGA AGAOAGACTT AAGATGAAAG CAAATGATTC AGCTCOCTTA TACCCCCATT 600 
AAATTCACTT TCAATTCCA 



Seq ID NO: 57 Protein sequence:
Protein Accession #: NP_003116

1 11 21 31 41 51

I I I I I I

MSSQQQKQPC IPPPQLQQQQ VKQP OQPP PQ EPCIPKTKEP CBPKVPEPCH PKVPEPOQPK 60 

LPEPCHPKVP EPCPSIVTPA PAQQKTKQK 

Seq ID NO: 58 DNA sequence 

Nucleic Acid Accession #: NM_001793.2

Coding sequence: 71-2560

1 11 21 31 41 51

I I I I I I

AAAGGGGCAA GAGCTGAGCG GAACACCG6C CC6CCGTCGC GGCAGCTGCT TCACC CCTC T 60 

CTCTGCA6CC ATGGGOCTOC CTOGTQGACC TCTCOCGTCT CTCCTCCTTC TCCAGGTTTG 120 

CZG6CT0CA6 TG0606GCCT CCGAGCGGTG COGGGOGGTC TTCAGGGAQS CTGAAGTQAC 180 

CTTGGAGGOQ GGAGGCGOGG AGCAGGAGCC CGGOaGGOG CTGGQGAAAG TATTCATQGG 240 

CTGCCCTGGG CAAGAGCCAG CTCTGTTTAG CACTGATAAT GATGACTTCA CTGTGCGQAA 300 

TGGCGAGACA GTCCAGGAAA GAAGGTCACT GAAGGAAAGG AATCCATTGA AGATCTTCCC 360 

ATCCAAAOGT ATCTTAC6AA GACACAAGAO ACZATTGGGTG GTTGCTCCAA TATCTGTCCC 420 

TGAAAAT6GC AA6QGTCCCT TCCCCCAGAG ACTCSAATCAS CTCAA6TCTA ATAAA6ATA0 480 

AGACACCAAG ATTTTCTACA 6CATCACGGG GCCG6GGGCA GACAGCCCCC CTGAGGGTGT 540 

CTTCGCTGTA GAGAAGGAGA CAGGCTGGTT GTTGTTGAAT AAGCCACTGG ACCGGGAGGA 600 

GATTGCCAAG TATGAGCTCT TTGGCCAOGC TGTGTCAGAG AATQGTGCCT C 3VSTGG AGGA 660 

CCCCATGAAC ATCTCCATCA TCGTGAOOOA CCAGAATOAC CACAAOCOCA AQTTTACCCA 720 

GGACACCTTC OGAGGGAGTG TCTTAOAGGQ AGTOCTACCA GGTACTTCTG T6ATGCA0GT 780 

GAC3VGCCACG GATGAGGATO ATGCCATCTA CACCTACAAT GGGGTGGTTG CTTACTCCAT 840 

CCATAGCCAA GAACCAAAGG ACCCACACX3A CCTC31TGTTC ACCATTCACC GGAGCACA66 900 

CACCATCAQC GTCATCTCCA GTG6CCTGGA CCGGGAAAAA GTCCCTGAGT ACACACTGAC 960 

CATCCAOGCC ACAGAC31TGG ATQGQQA080 CICCAOCACC A0G6CAGTQ6 CAGTA6TGGA 1020 

GATCCTTQAT 6CCAAT0ACA ATOCTCOCAT OTTTGACOCC CftGAAGTAOS AGGCOCATCT 1080 

GCCTGAGAAT GCAGTGGGCC ATGAGGTGCA GAGGCTGAOG (JTCACTGATC TGQACGCCCC 1140 

CAACTCACCA GCGTGGOGTG CCACCTACCT TATCATGQ6C GGTGACGAOO GGGACCATTT 1200 

TACCATCACC ACCCACOCTG AGAGCAACCA 6GGCATCCTG ACAACCAGGA AGG GTTTGG A 1260 

TTTTGAOGCC AAAAACCAGC ACACCCTGTA 08TTGAAGTG ACCAAC38AG6 CCCCTTTT6T 1320 

GCTGAAGCTC CCAACCTOCa CAGCCACCAT AOTOCTCCAC GTGGA6GATG TGAATGAGSC 1380 

ACCTGTGTTT GTCCCACCCT CCAAAGTCGT TGAGGTCCAG GAGGGCATOC CCACTGGGGA 1440 

GCCTGTGTGT GTCTACACTG CAGAAGACCC TGACAAGGAG AATCAAAAGA TCAGCTACOQ 1500 

CATCCTOAGA GAOCCAGCAG GGTGGCTAGC CATOGACCCA GACAGTGGGC AGGTCACAGC 1560 

TGTGGGCACC CTGGAC08TG AGGATGAGCA GTTTCTGAGG AACAACATCT ATGAAGTCAT 1620 

GGTCTTGGCC ATOGACAATG BAAGCCCTCC CACX»CTGOC A06QQAACCC TTCTGCTAAC 1680 

ACTGATTGAT GTCAATGACC ATGGCCCAGT CCCTGAGCCC CGTCAGATCA CCATCTGCAA 1740 

CCAAAGCCCT GTGCGCCAGG TOCTGAACAT CAOGGACAAG GAOCTGTCTC CXX:ACACCTC 1800 

CCCTTTCCAG GCCCASCTCA CAGATCACTC AGACATCTAC TG6A0GGCAG AGGTCAAOGA 1860 

GGAAGGTGAC ACAGTOG TC T TST CC CIGAA GAAGTTCCTO AAGCAdOATA CATATGAOGT 1920 

GCACCTTTCT CTGTCTGACC ATGGCftACAA AGAGCW3CTG ACOOTOATCA OGGCCACTGT 1980 

GTGOGACTGC CATGGCCATG TC3GAAACCTG CCCTGGACCC TGGAAGGGAG GTTTCATCCT 2040 

OCCTGTGCTG GGGGCTGTCC TGGCTCTGCT GTTCCTCCTG CTGGTGCTGC TTTTGTTGGT 2100 

GAGAAAGAA6 OQGAAGATCA AGGAGCCCCT CCTACTCCCA 6AAGATGACA CCCGTGACAA 2160 

OGTCTTCTAC TATGGOGAAG AGGGGGGTGQ GSAAOAGGAC CAOCSACTATQ ACATCACOCA 2220 

GCTCCACCGA GGTCTGGAGG CCAGGCCG6A G6TGGTTCTC OGCftATGAOG TGG^OCAAC 2280 

CATCATCCCG ACACCCATGT ACCGTCCTCG GCCAGCCAAC CCAGATGAAA TCG6CAACTT 2340 

TATAATTGAG AACCTGAAGG CGGCTAACaC AOACCCCACA GCCCOGCCCT ACGACA CCCT 2400 

CTTGGTGTTC 6ACTATGAQG GCAGOQGCTC OGACJGCOGCG TG0CT6AGCT CCCTCACCTC 2460 

CTCCGCCTCC GACCAAGACC AA6ATTACGA TTATCTGAAC GAGT6GG6CA 6CCQCTTCAA 2520 

GAAGCTGGCA 6ACATGTAC6 GTX3GCGGGGA GGAOGACTAG OOQGOCTGCC TGCAGGGCTG 2580 

GGGACCAAAC GTCAGGCCAC AGAGCATCTC CAAGGGGTCT CAGTTCCCCX: TTOWSCTGAG 2640 

GACTTCGGA6 CTTGTCAGGA AGTGGCCX3TA OCAACTTGGC GGAGACAGGC TATGAGTCTG 2700 

AOGTTAGAGT GGTTGCTTCC TTAGCCTTTC AGGATGGAGG AATGTGGGCA GTTTGACTTC 2760 

AGCACTGAAA ACCTCTCCAC CTGGGCCAGG GTTGCCTCAa AG60AAGTT TCCAOAAGCC 2820 

TCTTACCTOC GGTAAAATGC TCAACCCTGT GTCCTGGQOC 7GG6CCTGCT GTGACTGACC 2880 

lACABTGGAC TTTCTCTCTG GAATGGAACC TTCTTAGGCC TCCTGGTGCA ACTTAATTTT 2940 

TTTTTTTAAT GCTATCTTCA AAACGTTAGA GAAAGTTCTT CAAAAGTGCA GCCCAGAGCT 3000 

GCTGGGCCCA CTGGCCGTCC TQCATTTCTG GTTTCCAGAC QXAATGCCT CCCATT0G6A 3060 

TGQATCTCTG OGTTTTTATA CTGAGTGTGC CTAGGTTGCC CCTTATTTTT TATTTTOCCT 3120 

GTTGCQTTGC TATAGATGAA GGGTGA6GAC AATOGTGTAT ATGTACTA6A ACTTTTTTAT 3180 
TAAAQAAACT TTTCCCAGAA AAAAA 

Seq ID NO: 59 Protein sequence:
Protein Accession #: NP_001784.2

1 11 21 31 41 51

I I I I I I

MGLPSGPLAS LLUiQVCRIiQ CAASEPCRAV FREAEVTLBA G6AEQEPGQA LGKVFKGCFG 60 

QEPALPSTDN KJPTVRNGET VQERRSLKER NPLKIFPSKR ILRHHKRDWV VAPISVPESG 120 

R8PFPQRLNQ UCSHKDROTK IFYSXTXn^GA 0SPPB6VFAV EECBVGHlrLUf KPLDHESIAK 180 

YELPGHAVSE NGASVEDPMH ISIIVTDQND HKPKPTQDTP RGSVLEGVLP GTSWQVTAT 240 

DEDDAIYTYN GWAYSIHSQ EPKDPHDLMP TIHRSTGTIS VISSGLDREK VPBYTLTIQA 300 

TDKDGDGSTT TAVAWEZLO ANDNAPMFDP QKYEAHVPEN AVGaEVQRIiT VTDIJ}APKSP 360 

AMRATYLIMO GZU)GOBFTZT THPB6MQGIL TTRK6U>FBA XHQBTIiyVEV THBAPFVLKL 420 

PT8TATIWB VEDVHBAPVF VPPSKWEVQ EGZPTGBPVC VYTAEDPDKB HQKXSYRILR 480 

DPAGWLAMDP DSGQVTAVGT UJREDBQFVR HMIYBVMVLA MDHGSPPTTG TGTUiTLID 540 

VNDHGPVPBP RQITICWOSP VR0VIJ7ITDK DLSPHTSPPQ AQLTDOSDIY WTABVNBEta) 600 

TWLSLKKFli KQDTYDVBLS LSDHGNXEQL TVZRATVCDC EGHVETCPGP WXGGFZIiPVL 660 

GAVLALLFLL LVLLLLVRXK RKIKEPUiLP ED9>TRI»7VFy YGEB6GGEED OnVDITQiaR 720 

GLBARPEWL RNDVAPTIIP TPMYRPRFAN PDEI07FIIE HLKAANTDPT APPyXTTUiVF 780 
DYBGSGSDAA SLSSLT6SA8 DQDQDYDYIM EH6SRFKKLA DMYGGGBDO 

Seq ID NO: 60 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 162-428 

1 11 21 31 41 51 

I I I I I I 

OarrTCOSTT GGCGGCGGAT TCGAACGTTC GGACTGAGGT TTTTCTGCCT GAAGAAGOBT 60 

CATACGGACC GGATTGTTTT CGCTGGCCCA GTGTOOCOGG AGCTTGTGTG CX5ATACAQAG 120 

A6CACCTC66 AAGCTGAGGC AGCTGGTACT TGACAGAGAG GATGGCGCTO TCGACCATAO 180

TCTCCCAGAG GAAGCAGATA AAGOGGAAGG CTCCCC3GTGG CTTTCTAAAG CX3AGTCTTCA 240 

AGCGAAAGAA 6CCTCAACTT OGTCTOOAGA AAAGTGGTGA CTTATTGGTC CATCTGAACT 300 

GTTTACT6TT TGTTCATCGA TTAGCAGAAG AGTCCAGGAC AAACGCTTGT GOGAGTAAAT 360 

GTAGAGTCAT TAACAA6GAG CATGTACTGG CCGCA6CAAA GGTAATTCTA AAGAAGAGCA 420 

GAGGTTAGAA GTCAAAGAAC ATATTCTTGA AAGTTATGAT GCATTCTTTT GGQTGGTAAC 480 
AGATCATAAA 6ACATTTTTT ACACATCAGT TAATATGGQA TTATTAAATA TTGG 



Seq ID NO: 61 Protein sequence:
Protein Accession #: Eos sequence



1 11 21 31 41 51 

I I I I I I 

MALS7ZVSQR KQIKKKAPRG FIjKRVFKRJCK PQLRIiEKSta) ZJjVHLNCLZiF VHRLABESRT 
NACASKCRVZ NKEHVUVAAK VZLKKSHG 



Seq ID NO: 62 DNA sequence

Nucleic Acid Accession #: NM_000094.2

Coding sequence: 99-8933



1 11 21 31 41 51

I I I I I I

GGGCTGGAGG GGCGCTGGGC TOGGACCT6C CAAGGCCACC GCAGGGGGGA GCAAGGGACA 60

GAGGOGGGGG TCCTAGCTGA OGGCTTTTAC TGCCTAGOAT GAC6CTG0GG CTTCTGGTGG 120

C0606CTCTG C6CCG6GATC CTGGCAGAGG OGCCCOGAGT GCGA6CCCAG CACAGGGAGA 180

GAGTGACCTG CAC6CC5CCTT TACQCCGCIG ACATTGTGTT CTTACTGGAT G6CT0CTCAT 240

OCATTGGC06 CAGCAATTTC CG06AGGTCC GCAGCTTTCT GGAAGG6CT6 GTGCTGCCIT 300

TCTCTGGAGC AGCCAGTGCA CAGGGTGTGC GCTTTGCCAC AGTGCAGTAC AGCX5ATGACC 360

CACGGACAGA GTTCGGCCTG GATGCACTTG GCrCTGGGQG TGATGTGATC 060GCCAT0C 420

GTISAGCTTAG CTACAAGGGG GGCAACACTC GCACAGQGQC TGCAATTCTC CATGTGGCIXj 480

ACCATSTCTT CCTGCCCCAQ CTGQCOCGAC Cr Otf l g TCCC CAAGGTCT6C ATCCTGATCA 540

CAGACGGGAA QTCCCM3GAC CTGQTQGACA CAGCTGCCCA AAG6CTGAA6 6GGCAGGGGG 600

TCAAOCTATT TGCTGTGOGG ATCAAGAATG CTGACCCTGA GGAGCTGAAG CGAGTTGCCT 660

CACAGCCAAC CTCCGACTTC TTCTTCTTOG TCAATGACTT CAGCATCTTG AGGACACTAC 720

TGCCCCTGGT TTCCCGGAGA GIGTGCAOGA CTGCTG6TQG 0GTGCCT6TG ACCOGACCTC 780

G6GATQACTC GACCTCTGCT CCAOQAGAOC TCSGTGCTGTC TGA6CCAAGC A6CCAATCCT 840

1GAGASTACA GT6GACAG06 6CCASTG6CC CTGTGACTQ6 CTACAAGGTC CA6TACACTC 900

CTCTGAGGGQ GCTGGGACAG CCACTGC06A GTGAGCGGCA GGAGGTGAAC GTCCCA6CTG 960

GTGAGACCAQ TGTGCGGCTG OGGGGTCTCC GGCCACTOAC OGAGTACCAA GTGACTGTGA 1020

TTGCCCTCTA CGCCAACAGG ATCGGGGA6G CIGTGAGOQG GACAGCTCG6 ACCACTCCCC 1080

TAGAAGG60C 6GAACTGACC ATCCAGAATA CCACAOCOCA CAGCCTCCT6 GTGGCCTGGC 1140

G6AGTGT6CC AGGT6CCACT GGCTAC06T6 7GACATGGG6 GGTCCTCA6T GGTGG6CCCA 1200

CACAGCA0CA GGAGCTQG6C CCTGGGCAG6 GTTCAGTGTT GCTGCGTGAC TTGGAGCCTG 1260

GCACGGACTA TGAGGTGACC GTGAGCACCC TATTTGOCOS CAGTGTGGGG CCCGCCACTT 1320

CCCTGATGGC TCGCACTGAC GCTTCTGTTG AGCA6ACCCT GC6CCCGGTC ATCCTG6GCC 1380

CCACATGCAT CCTCCTTTGC TGQAACTTGG TQOCTGAQQC TO6TGQCTAC OGGTTGGAAT 1440

Q0060CQTGA dACTGGCTTO GAGCCACOQC AGAAOGTOGT ACTGCCCTCT 6ATGTGACCC 1500

OCTACGAOTT GGATGGGCTG CAGOTOGGCA CTGAGTACCG CXTCACACTC TACACTCTGC 1560

TQGAGG6CCA CGAGGTQGCC ACCCCTGCAA OCGTGGTTCC CACTGGACCA GAGCTGCCTG 1620

TGA6CCCTGT AACA6ACCTG CAAGCCACCG AGCT6GC0GG GCAQOGGGTG CGA6TQT0CT 1680

GGAGCOCAGT CCCTGGT6CC ACOCAGTACC GCATCATTGT GC6CA6CACC CAGG6GGTTG 1740

AGOGQVCCCT G6T6CTTCCT GGQAGTCAGA CAGCATTGGA CTT6GATGAC GTTCA6GCT6 1800

GGCTTA6CTA CACTGTGC6G GT6TCTGCTC GAGTGGGTOC COGTGAGGGC AGTGCCAGTG 1860

TCCrCACTGT C00CCGG6A0 C0G6AAACTC CACTTGCTGT TCCAGG6CTG 06GGTTGTGG 1920

TGTCAGATGC AA06CXSAGT6 AGGGTGGCCT GGGGACCOGT CCCTGGAGCC AGTGGATTTC 1980

6GATTA6CTG GAQCACAGGC AGTGGTCOGG AGTCCASOCA GACACTGCCC CGAQACTCIA 2040

CTGCCACAGA CATCACAGG6 CTGCAGCCTG GAACCACCTA CCAGGTGGCT GTGTCGGTAC 2100

TGOGAGGCAG AGAGGAGGGC CCTGCTOCAG TCATOGTQGC TCGAAOGGAC CCACTGGGGC 2160

CAGTGAQQAC GGTCCATGTG ACTCAGGCCA GCAGCTCATC TGTCACCATT ACCTGGACCA 2220

66GTT0CTG6 06CCACAG6A TACAGGGTTT CCT6GCACTC AGCCCA0G6C CCAGAQAAAT 2280

COCAGrrGGT TTCTGGGGAO GCCAOQGTGQ CTGAGCTCSGIi TGGACIGGAG CCAGATACIG 2340 

AGTATAOGGT 6CATGT6AGG GCCCATGTGQ C7G60GTGGA TOGGOOO C CT GCCTCTGT G G 2400 

TTQTGAGGAC TGCCCCTGAG CCTGTGGGTC GTGTGTOSAG GCTGCAGATC CTCAATGCTT 2460 

OCaGOGAOST TCTACGGATC ACCTGGGTAQ GGGTCACTOG AGCCACRGCT TACAGACTCG 2520 

GCTGGGGCOS GAGTGAAG6C GGCCCCATGA GGCACCAGAT ACTCCCAGGA AACAC3U3ACr 2580

CT6CA6AGAT CCGGGGTCTC GAAGGTQQAO TCAGCTACTC AGT006AGTG ACICCACTTG 2640 

TGGGG6AC0G C6AGGGCACA CCTGTCTCCA TTGTTGTCAC TAOGCC360CT GAG6CTCC3GC 2700 

CAGCCCTGGG GACGCTTCAC GTGGTGCAGC GCGGGGAGCA CTOGCTCAGG CTGCGCTGGG 2760 

AGCOGGTGCC CAGAGCGCAG GGCTTCCTTC TGCACTGGCA ACCTGAQGGT GGCCAGGAAC 2820 

AGTC0C6GGT CCTGGG6CCC GAGCTCAGCA 6CTATCACCT GGA08GGCTG GA6CCA606A 2880

CACAGTAGOO OCSTGAGGCTO AGTGTCCTAO GGGCX30CT0Q AGAAGG60CC TCICCAGAGG 2940 

TGACTGGOCXS CACTGA6TCA CCTOGTQTTC CAAGCATTGA ACT A OST G TG GTGGACACCT 3000 

OGATOGACTC GGTGACTTTG GCCTGGACTC CAGTGTCCAG GGCATOaGC TACATCCTAT 3060

CCTGGCGGCC ACTCAGAGGC CCTGGCXyVGG AAGTGCCTGG GTOCCOGCAa ACACTTCCAG 3120 

GGATCTCAAG CTCCCAGOGG GTGACAGGGC TAOAGCCTGG GGTCTCTTAC ATCTTCTCCC 3180

TGAGGOCTGT CCTGGATGGT GT60G6QGTC CTQAGOCATC TGTCACACAO ACGCCAGltST 3240 

GCCCOOGTG G CCTGGCG6AT GTG6TGTTCC TACCAGAT6C CACTCAAGAC AAT6CTCACC 3300 

GTGOGOAGGC TACX3AGC3U3G 6TCCTGGAGC G' i\ : mGlX rn' G6CACTTG6G CCTCnOGSC 3360 

CACAQGCAGT TCAGGTTOGC CTGCTGTCTT ACAGTCATCG GCOCTCCCCA CTCTTCCCAC 3420 

TGAATGQCrC CCSHGAjCCTT 6GCATTATCT TGCAAAG6AT C06TGACATG CCCTACATOG 3480 

AOOCAAGTOS GAACAACCT3 06CACAGCC6 TOGTCAGAGC ICACA6ATAC ATGTTSGCAC 3540 

CAGAT6CTCC TGGGOGCOCSC CA6CAOGTAC CAGGG6TQAT GGTTCTGCTA GTG6ATGAAC 3600 

CCTTGAGAGG TGACATATTC AGCCCCATCC GTGAOGCCCA GGCTTCTGGG CTTAATGTGG 3660 

TGATGTTGGG AATGGCTGGA GCGGACCCAG AGCAGCTGCG TOGCTTGGCG CCGGGTATGG 3720 

ACTCTGTCCA GACCTTCXTC GCC GT GGATG ATG6GCCAAG CCTGGACCAG GCAGTCAGTG 3780 

GTCTQGCCAC ASOCCTQTGT CAGGCATOCT TCACTACTCA G0CC05G0CA GAGOOCTGOC 3840 

CAGTGTATT6 TCCAAA6G6C CAGAAGGGGG AACCTQGAGA GATGGGCCTG AGAGQACAA6 3900 

TTGGGCCTCC TGGCGACCCT GGCCTCCCGG GCAOGACCGG TGCTCCCXSGC CCCCAGGGGC 3960 

CCCCTGGAAG TGCCACTGCX: AAGGGCGAGA GGGGCTTCCC TGGAGCAGAT GGGCGTCCAG 4020 

GCAGCCXrTGG CCGCGCCGGG AATCCTGGGA CCCCTGGAGC CCCTGGCCTA AAGGGCTCTC 4080 

CAGGGTTGCC 76GCCCT0G7 GGGGACCOSG GAGAG0SAG6 ACCTCGAGGC CCAAAGOGGG 4140 

AGCOQ6GGGC TCCCG6ACAA GTCAT0GGA6 GTGAAGGACC TGGOCTTCCT GGGOGGAAAG 4200 

GGGACXXTCG ACCATCXX3GC CCCCCTGGAC CTG6TGQACC ACTGGGGGAC CCAGGACCCC 4260 

GTGGCCCCCC AGGGCTTCCT 6GAACAGCCA TGAAGGGTGA CAAAGQCGAT CGTGGGGAGC 4320 

GGGGTCCCCC TOGACCAGGT GAAGOTGGCA TTGCTCCTGG GGAGCCTGGG CTGCCGGGTC 4380 

TTCCCGGAAG CCCTGGACCC CAACGCCCCG TTOGCCCCCC TGGAAAGAAA GGAGAAAAAG 4440 

GTGACrCTQA GGATGGAGCT CCAGGCCTCC CAGGACAACC TGGGTCTCCX3 GGTGAGCAGG 4500 

GCCXaOGGGG ACCTCCTGGA GCTATTGGCC CX3VAAGGTGA COGGGGCTTT CCAOGGCCCC 4560 

TGGGTGAOGC TGGAGAGAAG GGCGAAOGTG GACCCCCAGG CCCAGOGGGA TCCCGGGGGC 4620 

TGCCAGGGGT TGCTGGAOGT CXHXSGAGCCA AGGGTCCTGA AGGGCXa^CCA GGACCCACTG 4680 

GCCGCCAAGG AGAGAAGGOG GAGCCTGGTC GCCCTGGGGA CCCTGCAGTG GTG6GACCTG 4740 

CTGTTGCIGG ACCCAAAGGA GAAAAGGGAG ATGTGGGGCC CGCTGGGCCC A6AGGAGCTA 4800 

COGGAGTCCA AGGGGAACGG GGCCCACCCG GCTTGGTTCT TCCTGGAGAC CCTQGCCCCA 4860 

AGGGAGACCC TGGAGACCGG GGTCCCATTG GCCTTACTGG CAGAGCAOGA CXXXCAGGTG 4920 

ACTOVGGGCC TCCTGGAGAQ AAGQGAGACC CTGGGOQGCC TGGCCCCCCA GGACCTGTTG 4980 

GCOXCGAGG ACGAGATGGT GAAGTTGGAG AGAAAGGTGA CGAGGGTCCT CCGGGTGACC 5040 

OGGGTTTGCC TGGAAAAGCA GGCGAGOSTG GCCTTCQGGG GGCACCTGGA GTTGGGGGGC 5100 

CTGTGGGT G A AAA66QA6AC CA66GAGATC CTG6AGAGGA TGGACX3AAAT GGCAOCCCTG 5160 

GATCATCT06 ACOCAAOGGT GAC08TGGQ3 AGCGGGGTCC CCCAGGACCC CCGGGACGGC 5220 

TGGTAGACAC AGGAOCTGGA GCCAGAGAGA AGGGAGAGCC TGGGGACCGC 6GACAAGAGG 5280 

GTCCTOGAGG GCCCAAGGGT GATCCTGGCC TCCCTGGAGC CCCTGGGGAA AGGGGCATTG 5340 

AAGGGTTTCX3 GGGACCCCCA GGCCCACA6G GGGACCCAGG TGTCGGAGGC CCAGCAGGAG 5400 

AAAAQGGTGA C0G06GTCCC OCTGG G CTGG AIOGCOSGAG OSSACTOGAT GGGAAACCAG 5460 

GAGCO QC TGG GO0CTCT6GG COBAATGGTG CT6CAG0CAA AGCTQQGGAC CCAGQGAGAG 5520 

ACGGGCTTCC AGGCCTCCX3T GGA6AACAAG GOCTCCCTGG CCCCTCTGGT CCCCCTGGAT 5580 

TACCGGGAAA GCCAGGCXSAG GATGGGAAAC CTGGCCTGAA TGGAAAAAAC G6AGAACCTG 5640 

GGGACCCTGG AGAAGAOGGG AQGAA6GGAG AGAAAGGA6A TTCAC3GCGCC TCTGGGAGAG 5700 

AASGTOSTOA TOOCOOCAAG 6STSAGC6TG GAGCTCCI03 TAT0CTT8GA OOOCAOGOGC 5760 

CTOCAOGCCT CCCA666CCA GTG GG CCCTC CTQGOCAOGG TTTTCCTGGT GTCGCAGGAG 5820 

GCAOQGGCCC CAAGGGTGAC OGTGGGGAGA CTOGATCCAA AGGGQAGCAG GGOCTCCCTG 5880 

GAGAGOGTGG CCTGCGAGGA GAGCCTGGAA GTGTGCOGAA TGTGQATOGG TTQCPGGAAA 5940 

CIGCTGGCAT CAAGGCATCT GCGCTGCG6G AGATCGTGGA GACCTGGGAT GAGAGCTCTG 6000 

GZAGCTTCCT GOCTOTGCCC 6AAOGG06TC GAGGOCOCAA GOGOQACTCA OOOSAACAGO 6060 

GCCCCGCAGO CAAGGA6GGC CCCAT066CT TTGCTGQAGA AGQG0G6CTG AAQGG06ACC 6120 

GTGGAGACCX: TGGCCCTCAG GGGCCACCTG GTCTOGCCCT TGGGQAGA6G GGOXCCCCG 6180 

GGCCTTOOGG CCTTGCOGGG GAGCCTGGAA AGCCTGGTAT TCCCOGGCTC CCAGGCAGGG 6240 

CTGGGGGTGT GGGAGAGGCA GGAAGGCCAG GAGAGAGGGO AGAAC6GGGA GAGAAAGGAG 6300 

AAOG70SA6A ACAGGSCAGA GAXOGCOCTC CrSGACTOOC T9GAACCCCT GGGOCOCCOG 6360 

GAOCCCCTQG CCCCAAG6TG TCT6TGGAT6 A6CCAGQTCC TGGACTCTCT GGAGAACAGG 6420 

QACCCCCTGG ACTCAAGGGT GCTAAGGGQG AGCCGGQCAG CAATGGTGAC CAAGGTCCCA 6480 

AAGGAGACAG GGGTGTGCCA GGCATCAAAG GAGACCGGGG AGAGCCTGGA CCGAGGGGTC 6540 

AGGACGGCAA CCGGGGTCTA CCAG6AGAGC GTGGTATGGC TQGGCCTGAA GGGAAGCOGG 6600 

GTCT6CAG60 T0CAAGAG6C COCOCTGOCC CAOTQC G TGG TCATGGAGAC CCT66ACCAC 6660 

CTOGTOCCCC GQ6TCTT6CT GGCCCTGCAG GACCCCAAGG ACCTTCTG6C CTGAAGGGGQ 6720 

AGCCTQGA6A GACAGGACCT CCA6GAC6QG GCCTGACTGG ACCTACTGQA OCTGTQOGAC 6780 

TTCCTGGACC CCCOGGCCCT TCAGGCCTTG TGGGTCCACA GGGGTCTCCA GGTTTGCCTG 6840 

GACAAGTGGG GQAGACAGGG AA6CCGGGA0 CCCCAGGTG6 AGATG6TGCC AGTGGAAAAG 6900 

ATGGAGACAG AGG6A6CCCT GGTGTOCCAQ GGTCACCAGG TCTG O CT G GC OCIGTOGGAC 6960 

CTAAAOQAGA AOCTGGCCCC ACX3GGGGCCC CTG6ACAGGC TGTG6T0GGG CTC30CTGGAC 7020 

CAAAGG6AGA GAaGGGAGCC CCTGGAGGCC TTGCTGGAGA CCTGGTGGGT GAGCCGGGAG 7080 

CCAAAGGTGA CGGAGGACTG CCAGGGCCGC GAGGCGAGAA QGGTGAAGCT GGCOGTGCAG 7140 

GGGAGCCCGG AQACCCTGGG GAAGATGGTC AGAAAGQGGC TCCAGGACCC AAAGGTTTCA 7200 

AGGGTGACCC AQGAGTOGGG GTCC066GCT CCCCTG6GCC TCCTGGOOCT OCAOGTGTGA 7260 

AG6GAGATCT GGGCCTCCCT 6GCCT60C0G GT G CrOCT GG TGTTGTT6G6 TTCCOGGGTC 7320 

AGACAGGCCC T0QAGGA6AG ATG6GTCA6C CAGGCOCTAG TG6AGA60G0 GGTCT6GCAG 7380 

GCCCCCCA6G GAGAGAAGGA ATCCCAGGAC CCCTGGGGCC ACCTGGACCA CCGGGGTCAG 7440 

TGGGACCACC TQG6GCCTCT GGACTCAAAG 6AGACAAGGQ AGAGCCTGGA GTAGGGCTGC 7500 
CTGGGGCOOG AG60@U30GT G 006AG OC3IO GOkTOSGGGG TGRAGft TCOC OGCCCCGGCC 7560

AGGAGGGftCC COGAGGACTC A06G6GCCCC CTGGCAXSCAG GGGAGAGOGT GGGGAGftAGG 7620 

GTGATGTTGG GACTGCA6GA CTAAAGGGTG ACAA06GA6A CTCAGCTGTG ATCCTGGGGC 7680 

CTCCAGGCCC ACGCSGGTGCC AA6GGGGACA TGGGTGAAOG AG66CCTGSG GGCTTGGATG 7740 

GTGACAAAG6 AC C T O G G GGA GACAATGGGG AOOCTGGTGR CAA66GCAGC AAGGGAGAGC 7800 

CTGGTQACAA G6GCTCASCC GGGTTQCCAG GIVCItSOSIGG ACTCCIGGGft CCCCA06GTC 7860 

AAOCTGGTGC AGCAGG6ATC GCTGGTGACC 06G6RTGC0C AGGAAAGGAT G6AGT6CCTG 7920 

GTATCX3GAGG AGAAAAAGGA 6ATGTTGGCT TCATOGGTCC C06GG6CCTC AAG6GT6AAC 7980 

GGGGAGTGAA GGGAGCCTGT GGCCTTGATG GAGAGAAGGG AGACAAG6GA GAAGCTGGTC 8040 

CCCCA0GC06 CCCOGGGCTG GCAGGACACA AAGGAGAGAT GG6GGAGCCT GGTGT6C0GG 8100 

GCCAGTGGG6 GGC00CTG6C AAGGAGG6CC TGATOGGTCC CAA6GGTGAC OGAGGCTTTG 8160 

AGGGGCAGCC AG6CC0CAA6 GGT6ACCAGG GOGAGAAAGG GGAGOGGGGA ACCCCAGGAA 8220 

TTGGGGGCTT CCCAG6CCCC AGTGGAAATG AT60CTCTGC TGGTCCCCCA GGGCCACCTG 8280 

GCAGTGTT6G TCCCAGA6GC CCCGAAGGAC TTCAGGGCCA GAAGGGTGAG OGAGGTCCCC 8340 

COSGAGAGAG AGTGGT6GGG GCTCCTGGGG 7CCCTGGA6C TCCTG G OGAG AGAGGGGAGC 8400 

AGGGG06GCC AGGGOCTGOC GGTCCTGGA6 GCGA6AAGG0 AGAAGCTGCA CTQAOGGAQG 8460 

ATGACATCOG GGGCTTTGTG 06CCAAGAGA T6AGTCAGCA CTGTGOCTGC CAGGGCCAGT 8520 

TCATCGCATC TGGATCACX3A CCCCTCCCTA GTTATGCTGC AGACACTGCC GGCTCCCAGC 8580 

TCCATGCTGT GCCTGTGCTC CGC6TCTCTC ATGCAGAGGA GGAAGA6CGG GTACCCCCTG 8640 

AGGATGAT6A GTACTCTGAA TACTCOSAGT ATTC1X3TGGA GGAGTAOCAG GACCCTCAAG 8700 

CTOCTTGGGA TAGTGATGAC CCCTGTICOC TGCCACTGGA TGAG66CT0C tGCACTGCCT 8760 

ACACGCTGCG CTGGTACCAT C6GGCTGTGA CAGGCAGCAC AGAG60CTGT CAe XVlTlT G 8820 

TCTATGGTGG CT6TG6AGGG AATGCCAACC GTTTTGGGAC CCGT6AGGCC TGOGAGCGCC 8880 

GCTGCCX3^ CCX^GGTGGTC CAGAGCCAGG GGACAGGTAC TGCCCAG6AC TGAGGCCCA6 8940 

ATAATGAGCT GAGATTCAGC ATCCCCTGGA G6AGTC7GGGG TCTCAGCAGA ACCX:CACTGT 9000 

CCCTCCCCTT Q6TGCTAGAG GCTT6TGTGC AOGTGAGCGT G06AGTGCAC GTCOGTTATT 9060 

TCAGTGACTT GGTCCGGTGG GTCTAGCCTT CCCOOCTGT G GACAAACCCC GATIGTGGCT 9120 

CCTGCCACCC TGGCAGATGA CTCACTGTGG GGGGGTGGCT GTGGGCAOTG AGGGGATGTG 9180 

ACTGGCGTCT GACCCGCCCC TTGACCCAAQ CCTOTGATGA CATGGTGCTG ATTCTGGGGG 9240 
6CATTAAA6C TGCTGTTTTA AAAGGCAAAA AA 

Seq ID NO: 63 Protein sequence:
Protein Accession #: NP_000085.1

1 11 21 31 41 51

| | | | | |

MTLRLLVAAL CAGIIAEAFR VRAQHRERVT CTRLYAADZV FLLDGSSSIG RSNFREVRSF 60 

LEGLVLPPSG AASAQGVRFA 7VQYSDDPHT EF6U2AL6SG GDVIKAXRBL SYKGGNTRTG. 120 

AAIIfVADHV FLPQLARPGV PKVCILITDG KSQDLVDTAA QSLKGQGVKL FAV6IIQIA0P 180 

EELKRVASQP TSDPPPFVND FSILRTLLPL VSRRVCTTAO GVPVTRPPDD STSAPHDLVL 240 

SEPSSQSLRV QWTAAS6PVT GYKVQYTPLT GLGQPLPSER QEVNVPAGBT SVRLRGLRPL 300 

TEYQVTVIAL YANSIGBAVS GTARTTALBS PELTICZNITA HSLLVAHX^ FGATGYRVTW 360 

RVLSGGFTQQ QSL6P0QG5V LIi!U>LEPGTD' YEVrVSTLFG RSVGPATSIM ARTDASVEQT 420 

lAFVILGPTS ILLSWNLVPB ARGYRLEHRR ETGLEPPQKV VLPSDVTRYO LDGLQPGTBY 480 

RLTLYTIaLEG HEVATPATW PTGPELPVSP VTDLQATELP GQRVRVSWSP VPGATQYRII 540 

VRSTQGVERT LVLPGSQTAF DLDDVQA6LS YTVRVSARVG PREGSASVLT VRREPETPLA 600 

VPGLRVWSO ATRVRVAWGP VFGASGFRIS H5TGSGPE&S QTLPPDSTAT DITGLQPGTT 660 

YOfVAVSVLRO REEGPAAVIV ARTDPLGSVR TVHVTQASSS SVTXmTRVP GATGYHVSHH 720 

SABGPEKSQIi VSGEATVAEZi DGLEPDTSYr VRVRAHVAGV OGPPASWVR TAFEPVGRVS 780 

RLQZLNASSD VLRITWVGVT GATAYRLAHG RSE6GFNRHQ ILPGNTDSAE IRGLEGGVSY 840 

SVRVTALVGD REGTPVSIW TTPPBAPPAL GTLHWORGB HSLRLRWEPV PRAQGFUiHW 900 

QPBGGQBQSR VLGPELSSYH LDGLEPATQY RVRLSVI^A GBGPSAEVTA RTESPRVPSI 960 

ELRWDTSZO SVTLAMTFVS RA8SYILSNR PUIGPGQEVP GSPQTXiPGIS SSQRVTGLEP 1020 

GVSYXPSLTP VLDGVR6PBA SVTQTFVCPR GLADWFIiPH ATQZSNABRAB ATRRVLERLV 1080 

liALGPLGPQA VQVGLLSYSH RPSPLFPWG SHDLGIIXiQR IRDMFYMDPS GMNLGTAWT 1140 

AHRYMLAPDA PGRSQHVPGV HVLLVDEPLR GDIFSPIREA QASGLNWML 01AGADPEQL 1200 

RRIiAPGMDSV QTFFAVDDGP SLDQAVSGLA TALOQASFTT QPRPEPCPVY CPKOQKGEPG 1260 

EM6LRGQVGP PGDPGLP6RT 6AP6PQGPP6 SATARSERGF FGADGRPGSP 6RAGNFGTPG 1320 

AP6LK6SPGL PGPRO^PGBR GPRGPKC^PG AP6QVZG6E0 PGLPGRKGDP GPSGPPGPRG 1380 

PLGDPGPRGP PGLPGTAMKG DKGDRGERGP P6PGEGGIAP GEPGLPGLPG SPGPQGPVGP 1440 

PGKKGBKGDS EDGAPGLPGQ PGSPGBQGPR GPPGAIGPKG DRGFPGPLGE AGBK6ERGPP 1500 

GFAGSRGLPG VAGRPGAXGP EGPPGPT6RQ GEKGEPGRPG DPAWGPAVA GPXGEXGDVG 1560 

PAGPRGATGV QGERGPPGLV LP(S>POPiaa> F6DRGPZGLT 6RAGPPGDS6 VKE3SSDPGR 1620 

P6PPGFVGPR GRDGEVGBXO DBGPPQDPGL PGKA6ERGLR GAP6VRGPV6 FXXSKtBDPGB 1680 

DGRNGSPGSS 6PKGDRGEPG PPGPPGRLVD TGPGAREKGE PGDRGQE6PR OPKGDPGLPG 1740 

APGER6IEGF RGPPGPQGDP GVRGPAGEKG DRGPPGLDGR SGXiDGKPGAA GPSGPHGAAG 1800 

KAGDPGRDGI* PGLRGEQGIiP GPSGPPGIiPG KPGSDGKPGX. NGKNGEPGDP GEDGRKGERG 1860 

DSGASGRBQR D6PK8ERGAP 0ZL6PQGPP0 LFGPV8PPGQ (^PGVPGGTG FXGDRGBTGS 1920 

JC0BQ6LP6ER GLHGEP6SVP HVDSLLBTAG IKASALRSZV ETWDESSGSF LPVPERSR6P 1980

KGDSGBQGPP GKEGPIGFPG ERGI.K6DRGD PGPQ6PPGLA liGERGPFGPS GLAGEPGKPG 2040 

ZPGLPGRAGG VGEAGRPGER GERGEKGERG BQGRDGPPGL PGTPGPPGPP 6PKVSVDEPG 2100 

PGLSGEQGPP GLKGAKGEPG SN(a3QGPX6D RGVPGXKGDR GEPGPRGQDG NPGLPGERGM 2160 

AQPE6KPGLQ OFSOPFGFVG 6B6DFGPPQA PGLAGPAGPQ GPSGLSGEPG BT6PF6RGLT 2220 

GPTGAVGLPG PPGPSGLVGP QGSP6LPGQV GBT6KP6APG SDGASGKDGD R6SP6VPGSP 2280 

GLPGPVGPKG EPGPTGAPGQ AWGLPGAKG EKGAPGGIiAG DLVGEPGAKG DRGLPGFRGE 2340 

XGEAGRAGEP GDPGEDGQKG APGPKGFKGD PGVGVPGSP6 PPGPPGVKGD LGLPGLPGAP 2400 

GWGFPGQTG PRGEMGQPGP S6ERGLAGPP GRBGIPGPLG PPGPPGSVGP PGASGLKGDK 2460 

6DPGVGLPGP RGER6BPGIR GBDGSLPGQBB PRGOiTGPPGS RGSRGEK8DV GSAOLXSDKO 2520 

DSAVZLGFPG PRGAK6DMG6 RGPSGLDGDK QPBGDSGDVG OKGSKGEPGD KGSAGLPGLR 2580 

GLXiGPQGQPG AAGIPGDPGS PGKD6VP6IR GEKGDVGFK6 PRGLKGERGV RGAOGUX^EK 2640 

(33XGEAGPPG RPGLAGHKGE MGEPGVPGQS GAPGKEGLXG PKGDRGFDGQ PGPK£a)QGEK 2700 

GERGTPGIGG FPGPSGtlDGS AGPP6PPGSV GPRGPEGLQO QKGERGPPGB RWGAFGVPG 2760 

APGERGBQGR PGPAGFRGEK GBAALTBDDI RGFVRQEMSQ HCAOQQQFIA SQSRFLPSYA 2820 

ADTA6SQLBA VFVLRV8BAE EBERVPFBDD SYSBYSEYSV EEYQDPBAPW DSDDPCSLPL 2880 

OBSSCTAXTL RHYHRAVTSS TBACBPFVYG GOGQIAHRFG TREACESRCP PRWQSQQIG 2940 
TAQD 
Seq ID NO: 64 DNA sequence
Nucleic Acid Accession #: NM_006945

Coding sequence: 1-219



1 11 21 31 41 51

| | | | | |
ATGTCTTATC AACAGCAGCA GT6CAA6CAG CCCT G CCAGC CACCTCCTGT GTGOCCCAOO 60 
CCAAAGTGCC CAGAGCCATG TCCACCCCOG AAGTGCCCTG AGCCCTGCCC AOCACCAAAG 120 
TGTOCACAGC CCTGCCCACX TCAGCAGTGC CAGCAGAAAT ATCCTCCTGT 6ACACCTT0C 180 
GCACCCnSOC A6CCAAAGTA TCCAC06AAG A6CAAGTAA 

Seq ID NO: 65 Protein sequence:
Protein Accession #: NP_008876

1 11 21 31 41 51

| | | | | |

MSYQQQQCKQ PCQPPPVCPT PKCPEPCPPP KCPEPCPPPX CPQPCPPQQC QQKYPPVTPS 60 

PPCQPKYPPK SK 

Seq ID NO: 66 DNA sequence

Nucleic Acid Accession #: NM_005629.1

Coding sequence: 639-2546



1 11 21 31 41 51

| | | | | |

TAGTCGGAGC GAGGTGGOGA GT06CTGAGC CG6CCG0GGC CC0GAGA606 GCTGCAGCOS 60 

COGGCGGOQQ GAA0GAGAG6 GOGA6GOGOG CCOGAGCC36C C3GCCX;C06CC GCCACOGCCG 120 

C06C06CCAC CACOGCCACC 66AGT0G066 GCCMSOOQGG CAGCCTCG6C GGGCXXCGGC 180

0GGGG06GGG GGCG06GGCC ACAG6CCCCT GCTCCGGCOQ TCGTTTGCAO ACCGCGGGCG 240 

COGATGTOGC CCG06CCCC3G TTAGGATGAO TCTOGGGTOS GGCGAGGAGC CGCOGCAGCC 300 

GCOGCOGCCC GAGCCGOGGG CAGGAGCCTC GGGAGCCGCC GCOGCCGCCG CCGCOGCCOG 360 

6C0GGQCXKX GA06C0SCCC OOGOGCOCCC GGGCCCC06A CACACATGA6 ATTCTTCAGG 420 

CTCACTTTCA ASTOCTTOCST G6ACTGCTTC TGACTGGGCC GCCXXX33CCC CGCACCCOGC 480

OSTCOGCCOG C06CCCOGTC CCCOGGCCOG GCX3GC3CGCGC GGCCCC066C CGGCC06CGC 540 

CXrrCGGGGCC CTCCCCXOTG CCGCCGGTGC CCXXXS3CCT0 KC05CCGCCC CCOGTGAGGC 600 

GCCGOGACCC CGGCCCGG(X GTGCGGOCCG C06GGQ0CAT G606AAGAA6 AGOGCOGAGA 660 

ACGGCATCTA TAOOGTGTCC GGOGACGAGA AGAAG6GGCC CCTCATCX30G CC06GGCCCX3 720 

A0GGG8C0CC GGCCAAGOGC GAG6GC0006 TGGGOCTGGG GACACO0G6C G6C06CCZGG 780 

CCGTGCG6CC GCX3CGAGACC T0GAG605CC AGATGGACTT CATCATGTOQ T6C3GTQGGCT 840 

TCGCCGTGGG CTTGGGCAAC GTGTGGOGCT TCCCCTACCT GTGCTACAAG AAOSGCGGAG 900 

GTGTGTTCCT TATTCCCTAC GTCCTOATCXS CCCTGGTTGG AGGAATCCCX: ATTTTCTTCT 960 

TAGAGATCTC GCT6GGCCAO TTCATGAA66 CC6GCAGCAT CAATGTCTGG AACATCTGTC 1020 

OOCIGTTCAA AGGCCTGGGC TAOGOCTCCA TGGTGATOGT CTTCTACTOC AACACCTACT 1080 

ACATCATOGT GCTGGCCT GO G6CTTCTATT ACCTGGTCAA GTCCTTTACC ACCAOGCIGC 1140 

CCTGGGCCAC AT6TGGCCAC ACCTGGAACA CTCCOGACTG OGTGGAGATC TTCCGCCATG 1200 

AAGACTGTGC CAATGCCAGC CTGGCCAACC TCACXTTOTGA Ca^aCTTGCT GACCGCCQGT 1260 

OCCCTGTCAT 06AGTTCTGG GAGAACAAAO TCTTGA66CT GTCTGGGGGA CTGGAGGTGC 1320 

CAGQGGCX3CT CAACI6QGA0 OTGAOOCTTT GTCTGCTGGC CIGCXG60T6 CrOGTCTACT 1380 

TCTgitrrCT G GAAOGGOGTC AAATOCAOGG GAAA6AT06T GTACTTCACT GCTACATTCC 1440 

CCTAOGTGGT CCT6GTOGTG CTGCTGGTGC GTGGAGTGCT GCTGCCTGGC GCCCTGGATG 1500 

GCATCATTTA CTATCTCAAG CCTGACTGGT CAAAGCTGGG GTCCCCTCAO OTGTGQATAG 1560 

AT60GG6GAC CCAGATTTTC TTTTCTTAOG CX:ATTGGGCT GGQGGCCCTC ACAGCCCTGG 1620 

GCAGCTACAA CGGCTTCMC AACAACTGCT AGAAOOAOGC GATCATOCTG GCTCTCATCA 1680 

ACAGTGG6AC CAGCTTCTTT GCTG6CTT0G TGGTCTTCTC CATCCTGGGC TTCAT6GC7G 1740 

CAGAOCAGQG CGTGCAC3VTC TCCAA6GT6G CAGAGTCAGG GCCGGGCCT6 GOCTTCATCG 1800 

CCTACCX3Q0G GGCTGTCAOO CTGAT6CCAG TGGOOOCACT CTGGGCTGOC CTGTTCTTCT 1860 

TGATGCTGTT GCTGCTTGGT CTCGACAGCC 'AGTTTGTAGG TOTGGAGGGC TTCATCACCG 1920 

GCCTOCrOGA CCTCCTOCOG CCCTCCTACT ACTTCOGTTT CXAAAGGGAG ATCTCTGIGG 1980 

CCCTCTGTTG TGCCCTCTGC TTTGTCATOO ATCTCTOCAT GGrGACTGAT 6G0GGGATGT 2040 

ACGTCTTCCA GCTQTTTGAC TACTACTOGG CCAGCGGCAC CACCCTGCTC TGGCAOGCCT 2100 

TTTQGGAGTG OGTGGTGGTG GCCTGGGTGT AC3GGAGCTGA CCGCTTCATG GAOGACATTG 2160 

CCIQTATGAT CX^GGTACOGA CCTT6CCCCT GGATGAAAT6 GTGCT6GT0C TTCTTCACCC 2220 

G6CTGGTCT0 CATGG6CATC TTCATCTTCA AOGTTQTGT A CTACX3A6C00 CTOGTCTACA 2280 

ACAACACCTA 0GTGTACC08 TG G TOQGGTG AG6CXA3GG0 CTGGGCCTTC GCCCTGTCCT 2340 

CCATGCTGTO CX5TG006CTG CACCTCCTGG GCTGCCTCCT CAGGGCXaAG GGCACCATGG 2400 

CTGAGCXSCTG GCAGCACXTC ACCCAGCCCA TCTGGGGCCT CCACCACTTO GAGTACOGAG 2460 

CTCAGGAOGC AGATGTC3U36 GGCCTGACCA CCCTGACCCC AGTGTCOGAQ AGCAGCAAGG 2520 

TOSTOGtGOT GGAGAGTGTC ATGTGACAAC TCAGGTCACA TCAOCAGCTC AOCTC TOOIA 2580 

GGOVTAGCAG CCCCTGCTTC AGCQXACCO OVOOCCTCCA G0G66OCTGC CTTTCCCTGA 2640 

CACTTTTGGG GTCTGCCTGG GGGAGGAGGG GAGAAAGCAC CATGAGTGCT CACTAAAACA 2700 

ACTTTTTCCA TTTTTAATAA AACGCCAAAA ATATCACAAC CCACCAAAAA TAGATGCCTC 2760 

TCCOCCKCA GCCCTA6C06 AGCTGGTCCT A66CCC03CC TAGTGCOOC A CCCCCACXXA 2820 

CAGTOCIGCA CTGCTCCTGC CCCTGCCAOG OCCAOOOOCT GCCCAOCICT GGAGGCTCIO 2880 

CTCT6CAGCA CACCC6TGGG TGACCX3CTCA CCCCA6AAGC AGCAGTGGCA GCTTGG6AAA 2940 

TGTGAGGAAG GGAAGGAGGG AGAGAOGGGA GGGAGGAGAG AGAGGAGAA6 GGAGGCAG66 3000 

GAGQGGGAGC AGAACCAAGG CAAATATTTC AGCTGGGCTA TACCCCTCTC CCCATCCCTG 3060 

TTATAGAA6C TTA6AGAGCC AGCCA6CAAT 6GAA0CTTCT OGTTOCTGGQ GCAATOGCCA 3120 

GCAGTATCAA TIGTSTGAGC TT G GGTOOGA GltSGAOGOGT GGGTGAGTAC. GGAGAGIATA 3180 

TATAGATCTC TATCTCTTAG CAAAGGTGAA TGCCAGATGT AAAT060SCC TCTGGGCAAA 3240 

GGAGGCTTGT ATTTTGCACA TTTTATAAAA ACTTGAGAGA ATGAGATTTC TGCTTGTATA 3300 

TTTCTAAAAA GAGGAAGGAG CCCAAACCAT OCTCTCCTTA CCACTCCCAT CCCTGTGAGC 3360 

OCTACCTTAC CCCTCTGCCC CTAGCCAAGO AGTOTGAATT TAXAGATCTA ACTTTCATAG 3420 

GCAAAACAAA AGCTTOGAGC TGTT606T6T 6TGA6TCTG7 TGTGTGGATG T60GTGTGT6 3480 

GTCCCCAGOC CCAGACTGGA rTGGAAAAGT 6CAT0GTGGG GQCCTOGQGQ CTGTCCCCAC 3540 

GCTGTCCCTT T6CCACAAGT CTGTOSGGCA AGAGGCT6CA ATATTCOGTC CTGGGTGTCT 3600 

GGGCTGCTAA CCTGGCXnX^C TCA6GCTTCC CACCCTGTGC GGGGCACACC CCCA6GAAGG 3660 

OACCCTSGAC AOQOCTCCCA CGTCCAGGCT TAAGGTGGAT 6CACITCC0G CACCTCCAGT 3720 
CTTCTGTGTA GCAGCTTTAft GCCAOGmG TCTGTCACSST CCAGTCOOQH GAOQGCTGAG 3780
7GA0C0CAA6 AAAG6CTTCC GOGACAOCCA GACAGA8GCT GCACGSCTSQ OOCTGGGTGA 3840
GGGIGGOSGG OCTGCX SGaGA GATTCTACTQ T6CTAAAAA0 CCACTGCAQA CATAOCAATA 3900 
AAAACATGTC ATTTTGC 

Seq ID NO: 67 Protein sequence:
Protein Accession #: NP_005620.1

1 11 21 31 41 51

| | | | | |

KAKKSAEVGZ ySVSGDEKKG PLZAPGFDGA PAKGSIGPVGL GTF66SIAVP PR£TW7B(9fD 60 

PZHSCVGFAV GDGNVHRFPY LCYKN6GGVF LIPYVLIALV GGIPZFFLEI SLGQFMKAGS 120 

mvWNICPLP KGLGYASMVI VFYOmfYIM VLASraFVYLV KSFTTTLPWA TOGHTWNTPD 180 

CVEIFRHBDC AHASLANLTC DQLADRRSPV IBFWENKVLR LSGGLBVPGA UIWEVTLCLL 240 

ACHVLVyPCV MXGVK5T6KZ VYFTATFPYV VI.WI«LVRi6V LLPGALDGZZ YYLKPDWSKL 300 

GSPQVHIDA6 TQIFFSYAIG LGALTAIiGSy NRFNNKCyKD AZIIALINSO TSFPAGFWP 360 

SILGFMAABQ GVHISXVAES 6PGLAFIAYP BAVTLHFVAP LNAALFFFML hUUSLDSQFV 420 

GVEGFZTSIiL DLLPASVYFR PQRSISVALC CALCFVIDIiS MVTDGGMYVF QLFDYYSA5G 480 

TTLMJQAFWE CWVAKVYGA DRFMDDIACM IGYRPCPWMK WCWSPPTPLV CMGIPIFNW 540 

YYEPLVYNNT YVYPHffGEAM GHAFALSSMIi CVPLHLL6CL LRARGTMAER HQRLTQPIWG 600 
LHKLEYRAQD ADVRGLTTLT PVSESSKVW VBSVK 



Seq ID NO: 68 DNA sequence

Nucleic Acid Accession #: NM_021953.1

Coding sequence: 178-2469

1 11 21 31 41 51

| | | | | |

GGCACGAGGG GGACCOGGCC GGTCCGGOSC GAGCCCCOGT CCGOGGCCCT GOCTCGGCXX: 60 

CXIAGGTTGGA GGAGCCX30GA GCCX:GCXrrTC GGAGCTAOGG CCTAACGGCG GCGGCGACTG 120 

CAGTCTGGA6 6GTCCACACT TGT6ATTCTC AATGGAGAGT GAAAAC6CAG ATTCATAATG 180 

AAAGCXAGOC COCXnCBGCC ACTGATTCTC AAAAGAOSGA 6GCT6CCCCT TOCTGTTCAA 240 

AATGCCCCAA GT6AAACATC A6AGGAOGAA CCTAAGAGAT CCCCTGOCCA AC3U^6TCT 300 

AATCAAGCAG AGGCCTCCAA GGAAGTGGCG GAGTCCAACT CTT6CAAGTT TCCA6CTG6G 360 

ATCAAGATTA TTAACCACCC CACCATGCCC AACACGCAAG TAGTGGCCAT COCCAACAAT 420 

GCTAATATTC ACA6CATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480 

GGGGCCAACA AATTCATCCT CATCAGCIGT GGGG6AQCCC CAACTCAeCC TCCA6GACTC 540 

OGGCCTOUUl CCCAAACCAG CTATGATGCC AAAA6GACA6 AA6TGACCCT GGAGACCTTG 600 

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGAGC GCTTTGCGAG 660 

CAGAAACGGG AQACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA C3UITAGCCTA 720 

TCCAACATCC AGTGGCTT06 AAAGATGAGT TCTGAT6GAC TGGGCTCCCG CA8CATCAAG 780 

CAAGA6AZG6 AGGAAAAGGA GftATTGTCAC CTGGAOCAGC GACA6GTTAA GGTrSAGGAG 840 

CCTTCQAGAC CATCAGCGTC CT66CA0AAC TCTGTCTCTS AG066CCACC CTACTCT7AC 900 

ATGGCCAT6A TACAATTOGC CATCAACAGC ACTGAGAGGA AGCGCAT6AC TTTGAAAGAC 960 

ATCTATAOGT GGATTGAGGA CCACTTTCCX: TACTTTAAGC ACATTGCCAA GCCAGGCTGG 1020 

AAGAACTCCA TCC3GCCACAA CCTTTCCCTG CACGACATGT TTGTCC3GGGA GAOGTCTGCC 1080 

AATGGCAAGQ TCTCCTTCTQ 6ACCATTCAC CCCAGTGOCA ACG6CTACTT GACATTGGAC 1140 

CA0GT6TTTA AGCCACTGGA CCCAGGGTCT CCACAATTGC C0GA5CACTT GGAATCACA8 1200 

CAGAAACGAC C3GAAT0CAGA GCTCC36CCX3G AACATGACCA TCAAAACXXa^ ACTCCCCCTG 1260 

GGCGCACGGC GGAAGATGAA GCCACTGCTA CCACGGGTCA GCTCATACCT GGTACCTATC 1320 

CAGTTCCC6G TGAACCAGTC ACTGQTGTTG CAGCGCTOGG TGAAGGTGCC ATTGCCCCTG 1380 

GCGGCTTCCC TCATOAGCTC AGA6CTT6CC OGCCATAGCA AG06AGTCOG CATTGCCCCC 1440 

AAGQTGCTGC TA6CTGAG6A GGGGATAGCT CCl'C m ' Ly r CTGCAGGACC AGGGAAA6A6 1500 

QAGAAACTCC TGTTTGGAGA AGGGTTTTCT CCTTTGCTTC CAGTTCAGAC TATCAAGGA6 1560 

GAAGAAATCC AGCCTGGGGA GGAAATGCCA CACTTAGOSA GACCCATCAA AGTOGAGAGC 1620 

CCTCCCTTG6 AAGAGTGGCC CTCXOOGGCC CCATCTTTCA AAQAGGAATC ATCTCACTCC 1680 

TGGGAGGATT QGTCCC3UITC TCCX310GCCA AGAOOCAASA A6TCCIACAG TGGGCTTAG6 1740 

TCCCCAACCC GGTGTGTCTC GGAAATGCTT GTGATTCAAC ACA666AGA6 GAGGGAGAC36 1800 

AGCCGGTCTC GGAGGAAACA GCATCTACTG CCTCCCTGT6 T6GATGAGCC (S^AGCTGCTC 1860 

TTCTCAGAGG GGCCCAGTAC TTCCC3GCTGG GCCGCAGAGC TCCOGTTCCC AGCAGACTCC 1920 

TCTGACCCTG CCTCCCAGCT CAGCTACTCC CAGGAAGTGG GA6GACCTTT TAAGACACCC 1980 

ATTAAGGAAA 08CZ60CCAT CTCCTCCACC CGGAOCSUUIT CTGTCCTCCC CA82JICCCCT 2040 

GAATCCT66A GGCTCAOGCC CXXAGOCAAA GTAGGOGGAC TGGA7TTCAG CCCA6TACAA 2100 

ACCTCCCAGG GTGCCTCTGA CCCCTXGCCT GACCCCCTGG G6CTGATGGA TCTCAGCACC 2160 

ACTCCCTTGC AAAGTGCTCC CCCCCTTGAA TCAOOGCAAA GGCTCCTCAG TTCAGAACCC 2220 

TTAGACCTCA TCTCCX5TCCC CTTTGGCAAC TCTTCTCCCT CA6ATATAGA OGTCCCCAAG 2280 

CCAGGCTOCC 096AG0CACA OOTTTCTGGC. CTTGCAGGCA AT08TTCIC7 GAGAGAAGGC 2340 

CT6GTCCTG0 ACACAAT6AA TGACAOCCTC A0CAA6ATCC TOCTGGACAT CAGCTTTCCT 2400 

GGCCTGQA06 AGGACCCACT GGGCCCTGAC AACATCAACT GGTCXTAGTT TATTCCTGAG 2460 

CTACAGTA6A GCCCTOOCCT TGCCCCXGTQ CTCAAGCTOT CCACCATCCC GG6CACTCC3V 2520 

AGGCTC AGTO CAOO CgAG C CTCTGAG1X3A GGACAGCAGG CAGGGACZGT TCTGCTCCTC 2580 

ATA6CTCCCT GCTGCCTGAT TAT6CAAAA6 TAGCAGTCAC ACCCTAGCCA CTSCTGGGAC 2640 

CTTGTGTTCC CCAAGAGTAT CTGATTOCTC TGCT G T C OCT GCCAGGAGCT QAAGQGTGGG 2700 

AACAACAAAG GCAAT6G1GA AAAGA6ATTA GGAACCCCCC AGCCTGTTTC GATTCTCTXSC 2760 

CCAGCAGTCT CTTAOCTTCC CTGATCTTTQ CAGGGTGGTC CG7GTAAATA GTATAAATTC 2820 

TCCAAATTAT CCTCTAATTA TAAATGTAAG CTTATTTCCT TAGATCATTA TCCAGAGACT 2880

GCCAGAAGGT GG6TAGGATG ACCT GG GGIT TCAATTGACT TCT G T TC iCri' GCTTTTAGTr 2940 

TTGATAGAAG G(3UIGACCTG CAGT6CAQGG TTTCTTCCAG GCTGAGGTAC CTGGATCTTG 3000 

GGTTCTTCAC TGCAGGGACC CAGACAAGTG 6ATCTGCTTG CC3VGAGTCCT TTTTGCCCCT 3060 

CCCTGCCACC TCOCCGTGTT TCCAAGTCAG CTTTCCTGOl AGAAGAAATC CTGGTTAAAA 3120 

AAGTCTTTTG TATTGGGTCA G6AGTTGAAT TTG6GGTGGG AG6ATGGATG CAACTGAAGC 3180 

AGAGTGTC G G TGCCCAGATG TGOGCTATTA GATGTTTCTC TGAIAATGTC CXXAATCATA 3240 

CCAGGGAGAC TG6CATTGAC GAGAACTCAG QIG QAGQ CTT GAQAAQGCOG AAAG6GC0CC 3300 

TGACCTGCCT GGCTTCCTTA GCTTGOCCCT CASCTTTGCA AAGAGCCACC CTAOGCCCOV 3360 

GCTGACCGCA TGGGTGTGAG CCAGCTTOAG AACACIAACT ACTCAATAAA A006AAGGTG 3420 
6ACC3IAAAAA AAAAAAAAAA AAAA 



Seq ID NO: 69 Protein sequence:
Protein Accession #: NP_068772.1

1 11 21 31 41 51

| | | | | |

MKASPRRPZiX LRRKRLPLFV QKAPSETSEB EPXRSPAQQB SKQAEASKEV AESBSCKFPA 60 

GIKIIHHPTM POTQWAIPN NAMIHSIITA LTAKGKBSGS SGPNKPILIS CGGAPTQPPG 120 

LRPQTQTSYD AKRTBVTLBT LGPKPAARDV NLPRPPGALC BOKRBTCADQ EAAGCTINIIS 180 

LSVZQWIAKM SSDGUSSRSl KQEKEEKBNC BZiBQRQVKVE EPSRPSASHQ VSVSBRPPYS 240

YKAMXQFAXN STBRRRMTLK DIYTNZEDHP PYFKaiAKPG WKNSIRHMLS XADNFVRETS 300 

AIIGKVSFHTZ BPSAKRYLTL SQVFKPIiDPO SPQLPEBLES QQKRPNFELR RNMTXKTeLP 360 

LGARRKMKPL LPRVSSYLVP IQFPVNQSI.V LQPSVKVPLP lAASLMSSKL ARHSKRVRIA 420 

PKVLLAEEGI APLSSAGPGK EEKLLFCTGP SPIiLPVQTIK BEEIQPGEEM PHLARPIKVE 480 

SPPLEEHPSP APSFKEESSH SWEDS5QSPT PRPKKSYSGli RSPTRCVSEH LVZQHRERRE 540

RSRSRRKQHL LPPCVDEPEL LFSB6PST5R WAAELPFPAD SSDPASQLSY SQEVG6PFKT 600 

PIKBTLPISS TPSKSVLPRT PESWRIiTFPA KVOSLDPSPV QTSQGASDFL PDPLGUfDLS 660 

TTFLQSAPPL BSPQRIiLSSB PLDLZSVPFQ KSSPSDXOVP XP6SPEPQVS GLAAKRSLTB 720 

Ghvwnems LSKXhu>isF pgu)bdpziGp ntmmsQPZP blq 






Seq ID NO: 70 DNA sequence

Nucleic Acid Accession #: BC006529.

Coding sequence: 178-2424



1 11 21 31 41 51

| | | | | |

GGCAGGAG6Q GGACCCG6CC GGTCC6GC6C GAGCCCCCXn' C06G06CCCT G6CT06GCCC 60 

CCAGGTTGGA GGA60C0GGA GCCC6CCTTC GGAGCTACGG CCTAAGGGOG GCGGOGACTO 120 

CAGTCTGGAO G6T0CACACT TGT6ATTCTC AATGGA6A6T GAAAAOGCAG ATTCATAATG 180

AAAACTAGCC CCOGTOGGCC ACTGATTCTC AAAAGACGGA GGCTGCCCCT TCCTGTTCAA 240 

AATGCCCCAA GTQAAACATC AGAOGAGGAA CCTAAGAOAT CCGCTGCCCA ACAQGAGTCT 300 

AATCAAGCA6 AGQCCTOCAA QQAAKSTGGCA 6AGTCCAACT CTT6CAA6TT TCCA6CTGGG 360 

ATCAAGATTA TTAACCACCC CACCATGCCC AACAG6CAA6 TAGTGGCCAT CCCCAACAAT 420 

6CTAATATTC ACA6CATCAT CACAGCACTG ACT6CCAAG6 GAAAAGAGAO TG6CAGTA6T 480

GGGCCCAACA AATTCATCCT CATCAGCTQT GGGGQAGCCC CAACTCAGCC TCCAGGACTC 540 

06GGCTCAAA CCCAAACCAS CTATGATGCC AAAAGGACAO AAGTGACCCT GGAGACCTTG 600 

GGACCAAAAC CTQCAGCTAO QGATGTQAAT CTTCCIAQAC CACCTG6A6C CCTTTG06A6 660 

CAGAAA0GG6 AGACCTGTGC AGATCGTGAG GCAGGAQGCT GCACTATCAA CAATAGCCTA 720

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TG(3GCTCC06 CAGCATCAAG 780

CAAGAGATG6 AGGAAAAGGA GAA7TGTCAC CTGGAGCAGC GACAGGTTAA GGTTGAOGAG 840 

OCTTGGAGAC CATCAG06TC CTG6CAGAAC TCTGIGTCTG AGCG6CCACC CTACTCTTAC 900 

ATQOGCATGA 7ACAATT0QC CATCAACAGC ACTGAGAGGA AG06CATGAC TTimAAGAC 960 

ATCTATA06T GGATTQAOQA CCACTTTOCC TACTTTAAGC ACATT6CCAA 6CCAG6CT66 1020

AAGAACTCCA TCCX3CX3VCAA CCTTTCCCTC CACGACATGT TTGTCOSGGA GACGTCTGCC 1080

AATGOCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGCCA ACOGCTACTT GACATTGGAC 1140 

CA6GTGTTTA AGCAGCA6AA AC6ACCGAAT CCAGAGCTOC GCCX3QAACAT GAGCATCAAA 1200 

ACOGAACTCC CCCTGGQGGC ACGGOGQAAO ATGAAGCCAC TGCIA CCAOQ GC TCAO CTCA 1260 

TACCTGGTAC CTATOCAGTT OCOtSGTQAAC CAGTCAC708 TGTT6CAGCC CTCGGTGAAG 1320

GTGCCATTGC CCCTQGCX30C TTCCCTCATG AGCTCAGAGC TTGCCCGCCA TAGCAAGOGA 1380

GTCCX5CATTG CCCCCAAGGT GCTGCTAGCT GAGGAGGGGA TAGCTCCTCT TTCTTCTGCA 1440 

GGAGCA6GGA AA6AGGAGAA ACTCCTGTTT GGAGAA6G6T TTTCTCCTTT OCTTCCAOTT 1500 

GAOACTATCA AOGAGGAAGA AATCCAGCCT GQGGAG6AAA TGCCACACTT AGOGAGACCC 1560 

ATCAAA6TG6 A6AGGCCTCC CTT6GAAQAG TGGCCCTCCC GGGCCCCATC TTTCAAAGAO 1620 

QAATCATCTC ACTCCTGOOA GGATTCGTCC CAATCTCCCA COCCAAGACC CAAGAAGTCC 1680

TACAGTGGOC TTAGGTCCCC AACCOGGTGT GTCTCGGAAA TGCTTGTGAT TCAACACAGO 1740 

GAGAGGAGGG AGAGGA6CCG GTCTCGGAGG AAACAGCATC TACTGCCTCX; CTGTGTGGAT 1800 

6AGC0GGAGC T6CTCTTCTC AGA66GGC0C AOTACTTCOC aCTGGOCCOC A6AGCTCCG6 1860

TTCCCA6CAG ACTCCTCTGA CCCT6CCTCC CAGCTCAOCT ACTCCCAfiSA AGTO0GA6GA 1920 

CCTTTTAAGft CACXX»TTAA OGAAACOCTG CCCATCTCCT CCACCCC3GAG CAAATCTGTC 1980

CTCCCCAGAA CCCXTTGAATC CTGGAG<3CTC ACGCCXXCAG CCAAAGTAGG GGQACTGGAT 2040 

TTCAOCCCAG TACAAACCCC CCAGGGTGCC TCTGACOCCT TGCCTGACCX: CCTGQQOCTO 2100 

ATGQATCTCA GCACCACTCC CTTGCAAAGT GCTOCOOCOC TTGAAIXCACC GCTUUUSGCTC 2160 

CTCAGTTCAG AACCCTTAGA CCTCATCTCC GTCCCCTTTG OCAACTCTTC TOCCTCAGAT 2220 

ATAGACXSTCC CCAAGCXAGO CTCCXXX3GAG OCACAGGTTT CTGGCCTTGC AGC3CAAT0GT 2280 

TCTCTGACAO AAGGCCTGGT CCTGGACACA ATGAATGACA GCCTCAGCAA GATCCTGCTG 2340 

GACATCAGCT TTCCIGGCCT GGAC6AGGAC CCACTGGGCX: CTGACAACAT CAACTGGTCC 2400 

CAGTTTATTC CTGAGCTACA GTAGAGOCCT 0GCCTT60CC CTOT G CTC A A OCTGTCCACC 2460 

ATCCCG6GCA CTCCAAGGCT CAGTGCAGCX: CAAGCCTCTG A6TGAGGACA GCAG6CA6GG 2520

ACTGTTCTGC TCCTCATAGC TCCCTGCTGC CTGATTATGC AAAAGTAOCA GTCACACCCT 2580 

AGCCACTGCT GGGACCTTGT GTTCCCCAAQ AGTATCTGAT TCCTCTGCTG TCCCTGCCAG 2640 

GAGCTGAAGQ GTGGGAACAA CAAAGGCAAT GGTGAAAAGA 6ATTAG6AAC CCCCCAGCCT 2700 

tfrn-CC A aiV TCIGC O CAGC AGTCTCTTAC CTTCCCTGAT CTTTQCA0G8 TQGTCCGTGT 2760 

AAATAGTAIA AATTCTCCAA ATTATCCTCT AATTATAAAT OTAAGCTTAT TTCCTTAGAT 2820 

CATTATCCAG AGACTGCCAG AAGGTGOGTA GOATGACCTG GGGTTTCAAT TGACTTCTGT 2880 

TCCTTGCTTT TAGTTTTGAT AGAAOOGAAO ACCTGCAGTG CAOGOTTTCT TCCAGGCTGA 2940 

GGTACCTGGA TCTTGGGTTC TTCACTGCA6 GGAOOCAGAC AAGTQGATCT 6CTTGCCAGA 3000 

GTCC IT TTT G CCCCTCCCTG CCACCTCCCC GTGTTTOCAA GTCAQCTTTC CTGCAAGAAG 3060 

AAATCCTGGT TAAAAAAGTC TTTTGTATTQ 66TC3U3GAGT TGAATTTGGG GIGGGA66AT 3120 

GGATGCAACT GAAGCAGAGT GTGGGTGCCC AGATGTQOOC TATTAGATGr TTCTCTGATA 3180 

ATGTCCCCAA TCATACCAGG GAGACTGGCA TTGAC38AGAA CTCAGGTG6A GGCTT6AGAA 3240 

GGCOGAAAGG GCCCCTGACC TGCCTOGCTT CCTTAGCTTG CCCCTCAGCT TTGCAAA6A6 3300 

CCACCCTAG6 CX:CCAGCT6A CCGCATGGG7 GT6AGCCAGC TT6AGAACAC TAACTACTCA 3360 
ATAAAAGCGA AGGTG6AAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 71 Protein sequence:
Protein Accession #: AAH06529.1

1 11 21 31 41 51

| | | | | |

MKTSPHRPLI LKKRBLPLFV QNAPSETSES EPKRSPAQQB SNQAEASKEV AESKSCKFFA 60 

GIKIZNBFTM FMTQWAIFH NANIHSIITA LTAKGKBSGS S6PKKFILIS GGGAPTQPPO 120 

LRFQTQTSyD AXRTEVTLBT LGPXPAASOV NLFRFPOALC EQXKETCAS6 BAAGCTHOIS 180 

IiSHIQWUlKM SSDGL6SRSX XQEMSEKaYC RLEQRQVKVB EPSRPSASHQ KSVSERPFYS 240 

YHAKIQFAIK STERKRMTLK DIYTKIEDHF P^FKHIAKPG NKNSIRENL5 LHDMFVRBTS 300 

ANOKVSPWTI HPSANRYtiTL DQVPKQQKRP HPBLRRNMTI KTELPLGARR KMKPLLPRVS 360 

SYLVPIQFFV NQSLVLQPSV KVPLPLAASL MSSELARBSK RVRZAPKVI«L AEEX3ZAPLSS 420 

A6FGKBEKLL F6B6PSPLLP VQTXREEEIQ P6EEMPHLAR PIKVESPPLB EHPSPAPSFK 480

S&SSKSWEDS SQSPTPSPXK SYSGLRSPTR CVSEKIiVZQB JtERSERSRSR SXQRLLFPCV 540 

DEPELIiFSBG PSTSRWAAEL PFPADSSDPA SQLSYSQBVG GPFICTPIKBT Z^ISSTPSKS 600 

VLFRTPBSWR LTFPAKVGGL DP8PVQTFQG ASDPLPDPIiO IMDLSTTPLQ SAPPLESPQR 660 

LLSSEPLDLI SVPFGNSSPS DIDVPKPGSP &PQVS6IAAN RSLTBGLVIiD TMNDSLSKIL 720 
U>ISPP6LDB DPLGPDKINH SQFXPBLQ 



Seq ID NO: 72 DNA sequence
Nucleic Acid Accession #: U74612.1
Coding sequence: 178-2583

1 11 21 31 41 51

| | | | | |

GGCAOGAGGG GGACCCG6CC GGTCOSGCX5C 6AGCCCCCGT CCGGGCSCCCT GGCTOQGCCC 60 

CX»G6TTGGA GGAGCCC6GA GCCCGCCTTC GGAGCTAOGG CXITAAOGGOG 6CG60GACTG 120 

CAGTCTGGAG GGTCCACACT TGTGATTCTC AAT6GAGAGT GPAAACGCAG ATTCATAATG 180 

AAAACTAGCC CCCX3T0GGCC ACTGUVTTCTC AAAACSAOGGA OGCTGCCCCT TCCTGTTCAA' 240 

AAT6CCCCAA GTGAAACATC AGAGGAGOAA CCTAAGAQAT CCCCT6CCCA ACAGGAGTCT 300 

AATCAAGCAG AGGCCTCX3VA OQAAGTGGCA GAOTCCAACT CTTOCAAGTT TCCAGCTGGG 360 

ATCAAGATTA TTAACCACCC CACX31TGCCC AACACGCAAG TAGT6GCCAT CCCX3ACAAT 420 

GCTAATATTC ACAGCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAQ TGGCAGTACrr 480 

GGGOCCAACA AATTCATCCT CATCA6CTGT GGGG6AGCCC CAACTCA6CC TCCAOGACTC 540 

0G6CCTCAAA CCCAAACX3U3 CTATGATGCC AAAAG6ACA6 AAGTQACCCr GGAiSACCrtG 600 

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGA6C CCTTTGOGAG 660 

CAC5AAA0GGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAGCCTA 720 

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCCG CAGCATCAAG 780 

GAAGAGATGO AGGAAAAGGA GAATT6TCAC CT66A6CAGC GACAGGTTAA GGT TGRGGA G 840 

CCTT06AGAC CATCA60GTC CTG6CAGAAC TCTGT6TCT6 A6C36GCCACC CTACTCTTAC 900 

ATOSCCArGA TACAATTCGC CATCAACAOC ACTGAGAGGA AGOGCATGAC rTTCAAAGAC 960 

ATCTATACGT GQATTGAGGA CCACTTTCCC TACTTTAAGC ACAITGCCAA GCC31GGCTGG 1020 

AAGAACTCCA TCOGCCACAA CCTTTCCCTG CACGACATGT TTCTCXXSGGA GAOyrCTGCC 1080 

AAT6GC3UU3G TCTCCTTCTG GACCATTCAC CCCA6TGCCA ACG8CTACTT GACATTG6AC 1140 

CAGGTGTTTA AGCCACTGGA CCCAGGGTCT CCAOVATTGC CCGAOCACTT GGAATCAGAO 1200 

CAGAAACGAC OGAATCCAGA GCTCC3SCC3GG AACAT6ACCA TCAAAACCGA ACTCCCXXTG 1260 

GGCGCACGGC GQAAGATGAA GCCACTGCTA CCAOaGOTCA GCTCATACCT GGTAOCTATC 1320 

CAGTTCCOGG TGAACCAGTC ACTGGTGrTG CAGCCCT06G TGAAGGTGCC ATTGCCCCIG 1380 

GGG6CTTCCC TCATQAGCTC AGAGCTTGCC 06CCATAGCA AOOGAGTCCG CATT60CCCC 1440 

AAGGTTTTTG 60GAACAGGT 06TGTTT6GT TACAT6AGTA AGTTCTTTA6 TGG08ATCTG 1500 

OGAGATTTTG 6TACACCCAT CACCAGCTTG TTTAATTTTA TCTTTCTTTG TTTATCAGTG 1560 

CTGCTAGCTG AGGAGGGGAT AGCTCCTCTT TCTTCTGCAG GACCAGGGAA A6A06A6AAA 1620 

CTCCTGTTT6 GAGAAG66TT TTCTCCTTTO CTTCCAGTTC AGACTATCAA 6GA06AAGAA 1680 

ATOCAOOCTG OOOAOGAAAT GCCACACTTA G06AGA0CGA TCAAAGTOGA GA600CT0CC 1740 

TTGGAA6AGT G6CCCTCCCC G6CCCCATCT TTQlftAGAGG AATCATCTCA CTCCTGG6AG 1800 

GATTOGTCCC AATCTCCCAC CCCAAGACCC AAGAAGTCCT ACAGTGGGCT TAGGTCOCCA 1860 

ACCCGGTGTG TCTCGGAAAT 6CTTGTGATT CAACACAGGG AGAGGAGGGA 6AGGAGC0GG 1920 

TCT06GAGGA AACA6CATCT ACTGCCTCCC TGT61X3GATQ AS006GAQCT 6CTCTTCTCA 1980 

GAGG600CXA GTACTTLXXS CTGGOCOQCA OAGCTCGCGT TCCGAGGAGA CTCCTCTffllC 2040 

CCTGGCTCCC AOCTCAOCTA CTCCCAG6AA GTGG6A66AC CTTTTAAGAC ACCCATTAAG 2100 

GAAAC6CTGC CCATCTCCTC CACCCCGAGC AAATCTGTCC TOXCAGAAC CCCTGAATCC 2160 

TGGAGGCTCA CGCCCCCAGC CAAAGTAGGG GGACTGGATT TCA6CCCAGT ACAAACCTCC 2220 

CAGGGTGCCT CTGACCCCTT GCCTGACCCC CTGGGGCTGA TGGATCTCAG CACCACTCCC 2280 

TTGCAAAGTG CTCCCCCCCT TGAATCAOOS CAAAG6CT0C TCAGTTGAGA ACCCTTAGAC 2340 

CTCATCTCCO TC0CCTTTG6 CAACTCTTCT CCCTCAQATA TAGACGTCCC CAAGCCAGGC 2400 

TCCCCGGA6C CACAGGTTTC TGGCCTTGCA GCCAATCGTT CTCTGACAGA AGGCXntXSTC 2460 

CTGGACACAA TGAATGACAG CXrTCAGCAAG ATCCTGCTGG ACATCAGCTT TCCTQGCCTG 2520 

GAGGAGGACC CACTGOGCCC TGACAACATC AACTGGTCCC AGTTTATTCC TGAGCTACAO 2580 

TAGA800CT6 CGCJTOCCOC tGTGCTCAAG CrOTCGACCA TOCGGGGCAC T0CAA6GCTC 2640 

AQTGCACCCC AAGCCTCTGA OTQAGGACAG CA06CAG6(» CTGTTCTGCT CCTCATAGCT 2700 

CCCTGCTGCC TQATTATGCA AAAGTAGCAG TCACACCCTA GCCACTGCTG GGACCTTGT6 2760 

TTCCCCAA6A GTATCTGATT CXrTCTGCTGT CCCTGCCAGG AGCTGAAGGG TGGGAACAAC 2820 

AAAG6CAATG GTGAAAAGA6 ATTAGGAACC OCGCAGCCTG TTTGCATTCT CTGCCCAGCA 2880 

GTCTCTTACC TTCCCTGATC TXTGCA068T GGTCOGTGTA AATABTATAA ATTCTCCAAA 2940 

TTATCCTCTA ATTATAAATG TAAGCTTATT TCCTTAGATC ATl'ATCCAGA GACT6CCASA 3000 

AGGTGGGTAG GATGACCTGG GGTTTCAATT GACTTCTGTT CCTTGCTTTT AGTTTTGATA 3060 

GAAGGGAAGA CCTGCAGTGC AOGGTTTCTT CCAGGCTGAG GTACCTGGAT CTTGGGTTCT 3120 

TCACTGCAGG GACCCAGACA AGTGGATCTO CTTGCCAGAG TCCTTTTTGC CCXTTCCCTGC 3180 

CACCTCCCC6 TGTTTCC3UU3 TCAGCTTTOC TGCAAfiAAGA AATCXrTGGTT AAAAAAGTCT 3240 

TTTGTATTGG GTCAGGASTT 6AATTTG6G0 TQGGAGGATG GATGCAACT6 AASCA6AGT6 3300 

TGGGTGCCCA GATGTGCGCT ATTAGATGTT TCTCTGATAA TGTCCCCAAT CATACCAGGG 3360 

AGACTGGCAT TGACGAGAAC TCAGGTGGAG GCTTGAGAAG G0CGAAAG6G CCCCTGACCT 3420 

GCCTGGCTTC CTTAGCTTGC CCCTCAGCTT TGCAAAGAGC QlOCCTAGGC CCCA6CTGAC 3480 

OGCATGGGTG TGAGCCAGCT 7GACSAACACT AACXACTCAA lAAAAGGGAA GGTOQACAAA 3540 
AAAAAAAAAA AAAAA 

Seq ID NO: 73 Protein sequence:
Protein Accession #: AAC51126.1

1 11 21 31 41 51

| | | | | |

MKTSPRRPLI LKSRRIiFLPV QNAPSSTSEB EPiOlSPAQQE SNQAEASKEV ASSNSCRFPA 60 

GIKZXNKFTM FSTQVVAIPfi NANIHSZZTA LTAKGKBSG5 SGPNKFILZS OGGAPTQPPG 120

lAPOTOTSYD ARRTEVTLET XiGPKPAAHDV HLPRPPG21LC EQRRETCADO BMGCTZSKS 180 

LSNIQHLRKM SSD6LGSRSI RQEMBEREHC HLEQRQVKVE EPSRPSASWQ NSVSERPPYS 240 

YMAMIQFAIN STERKBMTLK DIYTWIEDHP PYFKHIAKPG HKHSIRHmtS LHDMFVRETS 300 

ANGKVSPIfTI HPSANRYLTL DQVFKPIDPG SPQLPEHLES CXJKRPNPBLR RNMTIKTELP 360 

LGARRKNKPL LPRVSSYLVP IQFPVNQSLV LQPSVRVPLP IAASLM5SEL ARHSKRVRZA 420

FXVFGEQWF GYKSKFFSGD IiBDFOTPZTS LFNFIFIiCLS VLLAEBGZAP LSSAGPGKEB 480 

KLLpGEGFSP LLPVQTIKEE BIQPGEEMPa LARPIKVESP PLBEtfPSFAP SFKEESSE5W 540 

EDSSQ5PTPR PKKSYSGLRS PTRCVSSa.V IQHRERRERS RSSSKQHLLP PCVDEPBLLP 600 

SEGPSTSRWA ABLPPPADSS DPASQLSYSQ EVGGPFKTPI KETLPISSTP SKSVLPRTPB 660 

SWRLTPPAKV 6GLDFSFVQT SQ6ASDPLPD FL6LMDLSTT PLOSUPPhBS PQRLLSSEPL 720

DXiZSVPFGtIS SPSDIOVPKP GSPBFQVSGL AANRSLTBGIi VUmoiDSItS KIZjU)ZSFPG 780 
XjSEDFUjFDN INWSQFIPEL Q 



Seq ID NO: 74 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 111-416



1 11 21 31 41 51 

GGGAAGAGCC AGGCTGAGCC TTATAAAfSGA CTGCTCTTTG TCCAAACACA CACATCTCAC 60 

TCATCCTTCT ACTOSTGACG CTTCCCAGCT CiX ^mTX ' GAAAGCAAAO ATGAGCAACA 120 

CTCAAGCIGA GAGG1XXATA ATAGGCATGA TCC5ACATGTT TCACAAATAC ACCAGACGTG 180 

ATGACAAGAT TGAGAAGCCA AGCCTGCTGA OSATGATGAA GGAGAACTTC CCX3VACTTCC 240 

TTA(3TGCCTG TGACAAAAAG GGCACAAATT ACCTCtSCOGA TGTCTTTGAG AAAAAGGACA 300

AGAATX3AGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGGGA GACATAGCCA 360

CAGACTACCA CAAGCAGAGC CATGGAGCAG CGCCCTGTTC OGGGGGCAGC CAGTGACCCA 420 
GCCCCACCAA T66GCCTCCA GAGACCCCA6 GAACAATAAA ATGTCTTCTG CCACCAGA 

Seq ID NO: 75 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MSNTQAERSI ZGMIDKFBKy TRRDDKZEKP SUiTMMKEWF PNFLSACDKK GTNYZADVFB 60 
KKDKNEDKKZ DFSEFLSLLG DZATDYBKQS BGAAPCSG6S Q

Seq ID NO: 76 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 111-416

1 11 21 31 41 51

| | | | | |

GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CI ' G C rCmtS TCCAAACACA CACATCTCAC 60 

TCATCCTTCT ACTOSTGACA CTTCCCAGTT CTGGCTTTTT GAAAGCAAAG ATGAGCAACA 120

CTCAAGCTGA GAGGTCCATA ATAGGCATGA TCGACATGTT TCACAAATAC ACCBGACGTG 180 

ATGGCAAGAT TGAGAAGCCA AGCCTGCTGA CGAT6ATQAA GGAGAACTTC CCCAATTTCC 240 

TCAGTGOCTG TGACAAAAAG 60CATACATT ACCTCGOCAC TGTCTTTGAG AAAAAGGACA 300 

AGAATGAOGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGGGA QACATAQCCG 360

CAGACTACCA CAAGCAGAGC GATGGAGOGG COCOCTGTTC TGGGGGAAGC CAGTGATCCA 420
GCCCCACCAA GGGGCCTCCA 6AGACCCCAG GAACAATAAG TGTCTCCTGC CACCAGA 






Seq ID NO: 77 Protein sequence:
Protein Accession #: XP_048124.1

1 11 21 31 41 51

| | | | | |

KSNTQAERSZ ZGKZDMFBinr TGRDGKZSKP SLLTMMKEHF PNFLSACDKK GIBYZATVFB 60 

RKDKNEDKKZ DFSEFLSIiLG DZAADYHKQS BGAAPCSGGS Q 

Seq ID NO: 78 DNA sequence
Nucleic Acid Accession #: Z73678.1
Coding sequence: 253-2433



1 11 21 31 41 51

| | | | | |

QGG G TG G TQC AGGGCAGGGG T6GTATATCC TGTCTGACG6 AQGGOGQGOC TCQCGAOTGC 60 

CAGAGAG66A 06AACCAGGG T(S3AAGC6CC AGGAGCAGCT GCAGGGAGCC CTCAGOGGGA 120 

CCTOGCACTC TATGGCCGTA GGGAGCOGCT GAGAOCGAGA AGAQCAOGCT CCTGCCCGCC 180

OGCTGCACCG CACCTC6CCT CGCCTCrCTO CTCTCCTAGQ CCCCGGCOGC GCGCCACCOG 240 

CCTCOOGCCA CCATGAACCA CT06CCGCTC AA6ACCGCCT TG60GTA0GA ATGCTT0CA6 300 

GAOCAOGACA ACTCCAOGTT GGCTTTGC06 TCGGACCAAA A6AT6AAAAC A6GCAGGTCT 360 

6GCAGGCAGC GCGTGCAGGA GCAGGTGAT6 ATGAOGGTCA AG0G6CAGAA GTCO^AGTCT 420 

TCCCAGTCGT CCACCCTGAG CCACTCCAAT CGAGGTTCCA TGTAT6ATGG CTT6GCTGAC 480

AATTACAACT ATG6GACCAC CA6CAGGAGC AGCTACTACT CCAAGTTCCA GGCAGGGAAT 540 

GGCTCATGGG GATATCOGAT CTACAATGGA ACCCTCAAGC GGGAGCCTGA CAACAGG06C 600 

TTCAGCTCCT ACAGCCAGAT 0GAGAACT6C AGCCG6CACT ACCCCCGGGG CAGCTGTAAC 660 

ACaiCOGGOG CAG6CAGCGA CATCTGCTTC ATGCAGAAAA TCAAGGCGAG OOGCAGTGAG 720 

CCOGACCTCT ACTGTGACCC ACGGGGCACC CTGCGCAAGG QCAOSCTGGG CAGCAAGGGC 780

CAGAAGACCA CCCAGAACCG CTACAGCTTT TACAGCACCT GCAGTG6TCA GAAGGCCATA 840 

AA6AAGTGCC CTGTGOOCCC GCCCTCTTGT GCCTCCAAGC AGGACCCTGT GTATATCCOG 900 
CCCATCTGCr GCAACAAOGA CC ltf lXX Vl ' T G6CCACTCTA GGGGCAGCrC OMGATCIGC 960

A0TGAG6ACA T06AGTGCAG TGGGCTCACC ATOCa3UU36 CTGTOCAOfTA CCTSAGCTCC 1020 

CAGGATGAGA AGTACCAGGC CATTOGGGCC TATTACATCC AGCATACCTG CTTCCAGGAT 1080 

GAATCTGCCA AGCAACAGGT CTATCAGCTO GGAGGCATCT GCAAGCTGGT GGACCTCCTC 1140

CX3CAGCCCCA ACCAGAA05T CCAGCAGGCC GOCSGCAOGGG OCCTGOGOUl CCTGGTGTTC 1200 

AGGASCAOCA CCAACAAGCT G6ASACC06Q AGGCAGftATG 6GATOC3GOGA G6CAGTCAGC 1260 

CTGCT6AGGA GAAC0GG6AA G6C0GAGATC CAGAAGOUSC imciGGQCT GCTCTGQAAC 1320 

CTGTCrrOCA CTGAOGAGCT GAAGGAGGAA CTCATTGCOG AOOCCCTGCC TGTTCTQGCC 1380 

GACCGCX3TCA TCATTCXCTT CTCTGGCTGG TGOSATGGCSV ATAOCAACAT GTCCOGGGAA 1440 

GTG6TGGACC CTGA66TCTT CTTCAATGCC ACAGGCIGCT TGAGGAACCT GA6CTCG6CC 1500 

GATGCftOGOC GCCAGACXAT GOGTAACTAC TCAOG6CTCA TT6ATTG0CT CATGGCXrTAT 1560 

GTOCAOAACT 6TGTAGC06C CAGCO G CTGT 6A0QACAA6T CTGTGGAAAA CIGCATOrGT 1620 

GTTCTGCACA ACCTCTCCTA COGCCTGGAC GCOGAGGTGC CCACCCGCTA CC6CC31SCTG 1680 

GAGTATAAOG CCCGCAAOGC CTACACCGAG AAOTCCTCCA CTOGCTGCTT CAGCAACAAG 1740 

AGOQACAAGA T6ATGAACAA CAACTATGAC TG0C0CCT6C CrSAC38AAGA 6ACCAACCCC 1800

AAGGGC3UX36 GCTGGTTGTA CXATTCAGAT 00CAT0C6CA CCTACCTGAA GCTCATG6GC 1860 

AAGAGCAAGA AA6ATGCTAC CCTGGAGGCC TGTGCTGGTG OCTTGCAGAA CCTGACAGCC 1920 

A6C3U«»;0GC TGATGTCCA6 TGGCATGAGC CAGTTGATTG GGCTGAAGGA AAAGG6CCT0 1980 

CCACAAATTG CCOGCCTCCT GCAATCTGGC AACTCTGATG TGGTGOGGTC OGGAGCCTCC 2040 

CTCCTGAGCA ACATGTCCCG CCACCCTCTG CTGCACAGAG TGATGGOGAA CX3VGGTGTTC 2100 

CCGGAGGTGA CCAGGCTCCT CACCAGCCAC ACTGGCAATA CCAGCAACTC GGAAGACATC 2160 

TTGTCCTOGG CCTGCTACAC TGTGAGGAAC CTGATOGCCT OGCAGCCACA ACTGOCCAAG 2220 

CAGTACTTCT CCAGCAGCAT GCTCAACAAC ATCATCAACC TGTGCCGAA6 CAGTGCCTCA 2280 

CCCAAGGCCG CAGAAGCTGC CCGGCTTCTC CTGTCTGACA TGTGGTCCA6 CAAGGAACTQ 2340 

CAGGGTGTCC TCAGACAGCA AGGTTTCGAT AGGAACATGC TGOGAACCTT A0CTG6GGCC 2400 

AACAGCCTCA GGAACTTCAC CTCCCXATTC TAAGAAGAGA CTGTCCAAGC A AGTTA OGCT 2460 

TGCAGGAAGA TATGACCCAG CTGAGAAGCC CTCAGGCCTC GCTGGATGGG GTTTTCTGTC 2520

CATCCTGTGC AGTATTTGGG AAAGTTCACA AGAAACTGAG AAGAAACCTA AAAACTGTGG 2580 

ATAGTGQAAA GATTTTTAGA rmTm ' i i ' CCTTGGG6AA ACTGGCAGGC AATGGGGGTT 2640 

A6GGAGGTTG GG60GGGGGG GGCTTTCTTG AGTTAAAGGG OCITATATGT GATGTCAATA 2700 

TTTCTTCCTC TGAGAAATGG TATATATATG TGTCTAATGT AAOTGTGTGC ATGCATGTGC 2760 

GOGTGCATGT GTGTGTGTGT QAGTGTCTTA AAGCAtAACC ACAAACTGCA AAAAGCTAGG 2820 

TAA6CTATTT TGrTGCAGCT CATAAQGrGO TGAAAAG6AC TCTCCTGTGT TTCTTACTCA 2880 

TAOGCAAOGA CAACATGTGC TTTTTGGTGA GCTGCTCATA ATTCCTGAAA TGTGTGGTGC 2940 

CAGGGCAAOO GSOCCATCAC TGCAGTCAGG CCCTO^GtiGG AGTCCTGCAO OCTTCCTACC 3000 

AGTGGTCTCC AAGGGTGCAG GAGTAACTGG GGCTGGGCCA GCCTCCCCCC TTACAAGGCT 3060 

GCTTTCCACG AAGGGAGGTC TGGTGTATCT CATGGGAGAA TCTGGGGTGT CTGTAGT6TC 3120 

ACGCCTCCA6 CAG06CCACA AOGACIGAGO TTGGGtAGGT GT6AGGTTCC AGAGGACAGC 3180

AGGACACTCT 09CATACTTT GCCAAAIGA6 GGCT6CTCA6 AOGAGTAGGA GCTSAAAGAT 3240 

GGTGCCTTCC ACCCTCTTGG GCTGTGTGCX: CATCASAGCA GGCTCAGCCT GCAAAGGCCC 3300 

TGCATTCAGA GGTCTTGTAA TCTACTTGTT GCAGGAGAAA GAAOGTAAAA AATGATTTTT 3360 

TTAAGAAAAG CTATTTTATT GCAGCTCTTT CCCAAGA3CT GTTCTGGGAA TGGCTGGTCT 3420 

TCATATTCCC AGTGGA6AGG GGAACAAOTO GGGCI06GCA TATACCTATT CCX3GCTTCTA 3480 

8TG6GATGGA GTTGGQGTAT AGAAATTAAC CAGGAAGATO TTTCCACCAA GCCTGCTGTG 3540 

AGTCAATTGA GGOAGTGTTT GGGTCCCAGO AGACTT6GAC GGGGGGAGTT TGGGTA6ACT 3600 

AGGAAAGGAA AGTCCXyiTAT CAGGGTACXX3 GTACCGGCAA GCTCACATCT CAGCCAGOGG 3660 

CCATGCCCCA CTTCCCCTGA CCCCAGCTGT CTTGTCTCCA CTCTGTGAAA OCCACAGGGO 3720 

ATGTGATAAA CAOGQCTATT AOGGGTATCA GCCAOOTOGA GCCCCCAGAC TCTGT6CACT 3780 

TCAGACCAGC AGCAGCAGGA OGGCTCCOGA GGGOCTTATG AGAAAACCTG TGTGGACATC 3840

CCTTGGTGTA CACTAAGACA GAGCAGAGCC CAGCX3CTCCC AAGCCTTCCT CCTTCCAGCT 3900 

TCTACCTCCA TGCTAGCATT GCTGGTGTTA GAGAGGAATT AACTTCCTGG TCTGTGCCCT 3960 

TCTCTAGAAG AATATAAGAT GCTCCTCCTC CTCACCCCTT CTCAOCCTCC TCCCAAQTCT 4020 

TCCTCTTCTG CAOCACCCCC GAGTCCAAAC CCACCTCTtO CCOCAGCATT CAGGCTGGAA 4080 

AACACTGA7G TGGACTCAGT ATGACAACTG AGATGGGGGA AGCCAGACAT GTGAGQAGGC 4140 

TGTCCTCC3GA GAGGTGTCCC GOGCTOTTAO CCAGCTGTGC TGTGGTGCTG TGGGTCTGTC 4200 

ATACCCTCCC TTGCTTCTGT TCACACTGGG AGGCCCACTC CT6GCTC3VCC TCTCCCTCTC 4260 

AOOQACCCAC GTGGGAGCCT GGATCCCTGG A C TGTCCTGG GCATAGGTTT CAGGGGCCTC 4320 

CTTTGTTGTC ATCAGAACOC AGAGGAATTC TTCTOCTAAA AAATACGTAT OGCATACCAA 4380 

TCTGTGCGGG GCAGTOTCCT AA6CACTTA6 ACTACATCAG GGAAGAACAC AGACCACATC 4440 

CCCGTCCTCA TGOGGCTTAT OTTTTCTGGA GGAAAGTGGA GACACAAGTC CrTGGCTTTA 4500 

GGGCTCCCCC GGCTGGGGQC TGTGCAGTCX: GGTCAGGGOG GGAGGGGAAA TGCACOGCTG 4560 

CATGTGAACC TTACCA6CCC AGGCGGATGC CCCTTCCCCT TAGCACTACC CTGGCCTCCT 4620 

GCATGOCCTC GC CT CA TGTT CCTCC m CCT TCAAAGAATO AAGAGCOOCA T06GCCCAGC 4680

GCCTGCCCT6 G6AAGCAGGC AGCXnTTOCAG ACCTCAGGGG CTGAGGCAGA CTATTA66GC 4740 

AGGGCTGACT TTGGTGAC3VC TGCTCATTOC CTCTCAGGCC AGCTCAGGTC AOCCGGGCCT 4800 

CTGACCCAGG CCTGTCACTT TGAGAGGGGC AAAACTGAGA GGGGCTTTTC CTAGAGAAAG 4860 

AGAACAA6GA GCTT6CCA6G CTTCATGTA6 CCGACACAOG TCTCAGGATT TTAAGTCX31C 4920 

ATTGOCCTGA CACTAGCCTA OGCCAATGOC CAAAATAAGQ AGTTCCAATT TGGGGCCAAA 4980 

TGAOGAAGGA CACAGACTCT GCCCTGG6AT CTCCT0T 6 CT AG006GCAAT GACAAATCCA 5040 

GTCATTGGCC ACCAGCCACC TCTGCAGTGG GGAOCACACT AOOWSCCCTG ACTCCACaCT 5100 

CCTCCTGGGG ACCXS^GAGG CAGTGTTGCT GTCTGOGTGT CCACCTTGGA ATCTGGCTGA 5160 

ACTGGCTGGG AGGACCAAGA CIGOQGCTGG GGT6Q6CAG6 GAAGG6AAGC GG66GGCTGC 5220 

TGTQAGQGAT CTTGQAGCTT CCCTGTAGOC CAOCTTCCCC TTGCTTCATG TTTGTAaAGO 5280 

AACCTTGTGC 0GGCCAG6OC CAGmCCIT GZGTQATACA CrAATGZATT TGCTTXTTTT 5340 
GGAAATAGAG AAAATCAATA AATTGCTAGT GTTTCTTT6A AAAAAAAAA 

Seq ID NO: 79 Protein sequence:
Protein Accession #: CAA98022.1

1 11 21 31 41 51

| | | | | |

KNHSPLICrAL AYECFQDQCN STLALPSDQK MKTGTSGRQR VQBQVMMTVK RQKSKSSQSS 60 

TLSSStmaSM YDGLAZSTYNY GTTSRSSYYS KFOAGNGSWG ypIYMGTIiXS EPDNSUFSSY 120 

SQMBNWSKHY PIU3SCNTTGA 6SDXCFMQKI KASRSEPDLY CDPRGTLRXG TLGSKGQKTT 180

QHRYSFYSTC SGQKAIKKCP VRPPSCASICQ DPVYIPPISC NKDLSFGHSR ASSKICSEDI 240 

BCSGLTIPKA VQYLS8QDER YQAIGAyYIQ HTCFQDESAK QQyYQLGGIC KLVDLLRSPN 300 

QKVQQAAAGA LRHLVFRSTT SKUBTRRQNG IREAVSLLR& TQDVEIQKQL TGLLMKLSST 360 
DSLKEELIAD ALFVLAZSRVZ IPFSGWCDGN SMMSRBWDP EVFFNATQCEi RNLSSAQAGR 420

QTMRHYSGLI DSLMAYVQKC VAASROIDXS VEHOtCVLHN LSYSLDABVP TRYRQLByilA 480 

HHAYTEKSST 6CPSNKSDXM MNNNYDCPIiP EEET17FKBSG WLYHSEAIRT YLKLM6KSKK 540 

DATLEACAGA LQtTLTASKGI* MSSOISQLIG LKEKGLPQIA RLLQSGNSDV VRSGA5LLSK 600 

KSSBPUiBRV MGNQVFPBVT RLLTSRTOIT SNSEDILSSA CYTVR23LMAS QPQLAKQYFS 660 

SSKLNKIINL CRSSASPKMl EAABLLLSDM WSSKELQGVL RQQGFDRNMIi GTLAOANSLR 720 
NFTSRP 



Seq ID NO: 80 DNA sequence

Nucleic Acid Accession #: NM_006516.1

Coding sequence: 180-1658

1 11 21 31 41 51

| | | | | |

TAGT0GCSG6 TCCCC6AGT6 A6CAC60CAG GGAGCAGGAO ACCAAAOQAC GaGQGTOQQA 60 

GTCAGAGT06 CAGTG6GAGT CCCOGGACGG GASCA06AGC CTGA6G66GA GA6C6CC6CT 120 

CGCACGCCCG TCGCCACCCG CGTACCOGGC GCAGOCAGAQ CCACCACCGC AGCGCTGCCA 180 

TGGAGCCCAG CAGCAAGAAG CTGAOGGGTC GCXTTCATGCT GGCTGTGGGA GGAGCAGTGC 240 

TT66CT0CCT GCAGTTT6GC TACAACACT6 GAGTCATCAA TGCCCCCCAG AAG6TGAT0G 300 

AGGAGTTCtA CAAGGAGACA TGGGTOCACC GCIATGGGGA GAGCATOCTG CCCACCAOQC 360 

TCACCaOGCT CTGGTCOCTC TCAGTGGCCA TCTTTTCTGT T GG GGGCATG ATTGGCTCCT 420 

TCTCTGTGGG CCTTTTCGTT AACCGCTTTG GCCGGOGGAA TTCAATGCTO ATGATGAACC 480 

TGCTGGCCTT CGTGTCCGCC GTGCTCATGG GCTTCTOGAA ACTGGOCAAO TCCTTTGAGA 540 

TGCTGATOCT GGGOCGCTTC ATCATOCSGTG TGTACTGOSG CCTGACCACA GGCTTCGTGC 600 

CCATGTATGT GQGTGAAGTG TCACXX3VCAG CCTTTOGTGG GGCCCTGGGC ACCCTGCACC 660 

AGCTGQGCAT CGTC3GT0GGC ATCCTCATCX3 CCCAGGTGTT CGGCCTGGAC TCCATCATGG 720 

GCAACAAGGA CCTGTGGCCC CTGCTGCTGA GCATCATCTT CATCCCGGCC CTGCTGCAGT 780 

GCATCGTGCT GCCCITCTGC CCCGAGA6TC CCC6CTTCCT OCTCATCAAC CGCAACGAGG 840 

AGAACC3GG6C CAAGAGTGTG CTAAAGAAGC TG06CQGGAC AGCTGA05TG ACCCATGACC 900 

TGCAGGAGAT GAAGGAAGAG AGTCGGCAGA TGATGQOGGA GAAGAAGGTC ACCATCCTGG 960 

AGCTGTTCOG CTCCCCOSCC TACOGCCAGC CCATCCTCAT OGCTGTOGTO CTGCAGCTCT 1020 

CCCAGCAGCT GTCTG6CATC AACGCTGTCT TCTATTACTC CACGAGCATC TTCGAGAAGG 1080 

CGGGGGTGCA GCAG0CT6TG TATGCCACCA TTGGCTCCGG TATOGTCAAC ACX^GCCTTCA 1140 

CTGTCGTGTC GCTGrTTGTG GTGGAGOGAG CAGGCCQGOG GACCCTGCAC CTCATAGGCC 1200 

TOGCTGGCAT GG06GGTTGT GCCATACTCA TGACCATCX5C GCTAGCACTG CTGGAGCAGC 1260 

TACCCTGGAT GTCCTATCTG A6CATCGTGG CCATCTTTGG CTTTGTGGOC TTCTTTGAAG 1320 

TGGGT0CT3Q OCCCATCCCA TGGTTCATOG TG6CTGAACT CTTCAGCCAG GGTCCAOGTC 1380 

OVGCTGCCAT TGCOGTTGCA G6CTTCTCCA ACTGGACCTC AAATTTCATT GT6GGCATGT 1440 

GCTTCCAGTA TGTGGAGCAA CTGTGTGGTC CCTAOQTCTT CATCATCTTC ACTGTGCTCC 1500 

TGGTTCTGTT CTTCATCTTC ACXTTACTTCA AAGTTCCTGA GACTAAA08C CGGACCTTCG 1560 

ATGAGAT06C TTCX:G6CrTC CGGCAGGGQG 6AGCCAG0CA AA6TGATAAG ACACCCGAGG 1620 

A8CTGTTCCA TCC0CT0G66 GCT6ATTCCC AAGTGT6AGT G6C0CCAGAT CACCAGCCOB 1680 

GCCTGCTCCC AGCAGCCCTA A6GATCTCTC AGGA&CACAG GCA6CTGGAT GAGACTTCCA 1740 

AACCTGACAG ATGTCAGCOG AGCCGGGCCT GGGGCTCCTT TCTCCAGCCA GCAATGATGT 1800 

CCAGAAGAAT ATTCAGGACT TAACGGCTCC AGGArTTTAA CAAAAGCAAO ACTGTTGCTC 1860

AAATCTATTC AGACAAGCAA CAGGTTTTAT AATZTTTTTA TTACTGATTT TGTTATTTTT 1920 

ATATCAGCCT QAOTCTOCTG TGCCCACATC CCAGGCTTCA CCCTGAATGO TTCCATGCCT 1980 

GAGOQTOGAO ACTAAGCCCT GT0GA6ACAC TTGCCTTCTT CACCCASCTA ATCTGTAOGO 2040 

CTGGACCTAT GTCCTAAGGA CACACTAATC GAACTATGAA CTACAAAGCT TCTATCCCAG 2100 

GAGGTGGCTA TGGCCACC06 TTCTGCTGGC CTGGATCTCC CCACTCTAGG GOTCAGGCTC 2160 

CATTAGGATT TQCOCCTTCC CATCTCTTCC TACCCAACCA CTCAAATTAA TCTTTCTTTA 2220 

CCIQAGACCA GTTGGQftGCA CTGGAGT6CA G6GAGQAGA0 GG6AAGQ00C AGTCTGGQCT 2280 

GCCOOGTrCT AGTCTCCTTT 6CACTGAGGG CCACACTATT ACCATGAGAA GA6G6CCTGT 2340 

GGGA6CCT6C AAACTCACTG CTCAAGAAGA CATGGAGACT CCTGCCCTGT TGTGTATAOA 2400 

TGCAAGATAT TTATATATAT TTTTGGTrGT CAATATTAAA TACAGACACT AAGTTATAGT 2460 

ATATCTGGAC AAGCCAACTT GTAAATACAC CACCTCACTC CTGTTACTTA CCTAAACAGA 2520 

TATAAATQ6C T GG TITTTAG AAACATGGTT TTGAAATOCT TGT0GATT6A GG6TA6GA6G 2580 

TTTGGATGG6 AGTGAGAGAG AAGTAAGTG6 GGTTGCAACC ACTGCAA066 CTTAGACTTC 2640 

GACTCAGGAT CCAGTCCCTT ACAOGTACCT CTCATCAGTG TCCTCTTGCT CAAAAATCTG 2700 

TTTGATCCCT 6TTACCCA6A GAATATATAC ATTCTTTATC TTGACATTCA AGGCA3TTCT 2760 

ATCACATATT TGATAGTTGG TGTTCAAAAA AACACTAGTT TTGTGCCA6C CGTGATGCTC 2820 
A6QCTT6AAA TCGCATTATT TTGAATGTGA A6G0AA 

Seq ID NO: 81 Protein sequence:
Protein Accession #: NP_006507.1

1 11 21 31 41 51 

| | | | | |

MEPSSKKLTO RLMLAVGGAV LGSLQFGYHT GVINAPQKVI EEFYNQWVH RYGESILPTT 60 

LTTliWSLSVA IFSVGGMIGS PSVGLFVNRF GRRKSMUMN LLAFVSAVLM GPSKLGKSFE 120 

MLIX^GRFIXG VYCGLTT6FV PMYVGEVSPT AFRGALGTLH QLGIWGILI AQVFGliDSIM 180 

GHRDLifPLLL SIIFXPALLQ CIVLPPCPES PRFLX«ZNSNE EKRAKSVLKK LRGTADVTHD 240 

LQEMiCEESRQ MMREKKVTIL ELPRSPAYRQ PZLIAWLQL SQQLSGIIIAV FVySTSIPBR 300 

AGVQQPV^T IGSQIVHTAP TWSLFWER AGRRTLHLIG LAOIAGCAIL MTIALALLKQ 360 

LPHMSYLSIV AIPGFVAPFE VGPGPIPWFI VABUSQGPR PAAIAVAGPS NWTSNPIVO« 420 

CPQYVBQL06 PYVPZIFTVL LVLFFZPTYF KV PKTKGHT P DBIASGFRQG GASQSDKTPB 480 
BLFEPLGADS QV 

Seq ID NO: 82 DNA sequence
Nucleic Acid Accession #: BC001291
Coding sequence: 44-541

1 11 21 31 41 51

| | | | | |

OGGGGOSCOG OGOGCTGACC CTCCCTGGGC ACCGCTGGGG AOGATGGOGC TGCTOGCCTT 60 

GCTGCTGGTC 6TG0C0CXAC OGOGGGTGTG GACAGAOSCC AAGCTGACTG CGAGACAAOQ 120 
AGATCCAGA6 GACTCCCAGC GAA0G6A0GA GGGIGACAUT AGAGTGTGGT
TGA6AGAGAA AACACTTTOG AGTGCCAGAA OCCAAGGAGG TGCAAATGGA 
CTGOSTTATA GOGGCCGTGA AAATATTTCC AOGTTTTTTC ATGGTTGCGA 
CGCTGGTTGT GCAGOGATGO AGAGACCCAA QCCAGAGGAQ AAGCGGTTTC 
GCCCATGCCC TTCTTTTACC TCAA6T6TTG TAAAATTOGC TACTGCAATT 
AOCTATGAAC TCATCAGTCT TCAAAGAATA 1GCTGGGA6C ATSGGTCAGA 
GCTGTG6CTG GCCATCCTCC TGCTGCTGGC CTCCATTGCA GOC33GCCTCA 
AGCCAOGGGA CTGCCACAGA CTGAGCCTTC OGGAGCATGG ACTOSCTCCA 
ACCTGTTGCA TTAAACTTGT TTTCTGTT6A TTACXTTCTTG GTTTGACTTC 
GGGAXG66A6 AGTQGGGATC AGGTGCAGTT GGCTCTTAAC CCTCAAGGGT 
ACATTCA6A0 GAAOTCCAGA TCTCCTGAST AGT6ATTTT6 GT6ACMGTT 
AAATCAAACC TTOTAACTCA TTTATTGCTS ATOGOCACTC TTTTOCTTGA 
CCTCTGAtSGG CTTCAGTATT GATGGGGAGG GAGGCCTAAG TACCACTCAT 
TGCTGAGATG CTTCOGACCT TTCAGGTGAC GCAQ6AACAC T6GGGGAGTC 
GGGT6AAGAC ATCCCTGGA6 TGAAGGACTC CTCA6CATGG GGGGCAGIGG 
AGGGCTGCCC CCATTCCAGT GG20QAG8CG CTGTGGAKGG CIGCTTTTOC 
CTACCAGATT CCAGGAGGCA GAAGATAACT AATTGTGTTG AAGAAACTTA 
ACCAGCTGGC ACAGGTGCAC AGATTCATAA ATTCCCACAC GTGTGTGTTC 
ACTTAGGCCA A6TAGAGAGC ATCAGGGTAA ATGGCGTTCA TTTCTCTGTT 
CATCCAT6GG GAGCTGAGAA ATCAGACTCA AAGTTCCAOC AAAAACAAAT 
TTCAAAAOTT CAOQAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

Seq ID NO: 83 Protein sequence:
Protein Accession #: AAK01391

1 11 21 31 41 51

| | | | | |

MALLALIiLW ALFHVWTDAN LTARQRDPED SQRTDEGDKR VWCHVCKSBS TFEOQNPRRC 
KWTEPyCVZA AVKZFPRPFM VAKQCSAGCA AMERPKPBEK RFLLKEPMPP PYLKOCaCXRY 
Q7LEGPPIKS SVFKEYAGSN GESCGGLHLA ZLLLLASZAA GLSLS 

Seq ID NO: 84 DNA sequence

Nucleic Acid Accession #: NM_022893.1

Coding sequence: 229-2726






GTCAIGTTTG 
CAGAGOCATA 
AGCAGTGCTC 
TCCTGGAAGA 
TAGAGGGGCC 
GC C GI WI Q B 
GCCT6TCTTG 
6ACCGTTGTC 
CCAGGGTCTT 
TCTTTAACTC 

CTCCCCrCTG 
GGAGAGTATG 
TGAATGATTG 
G6CACA0GTT 
TCAACCTTTC 
GACTTCACCC 
AACATCTGAA 
AAGATGCAGC 
ACAAGGGGAC 



TTTTTTTTTT 
TGC6CCATCT 
TTTTCTCTGG 

AAGCAA6GCA 
ATTCTTACAG 
CTCCTCACCT 
GAGCACAAAC 
CCTTCCCCTT 
GTCACGCCAG 
GAACACATAG 
GGAGCTCTAA 
GAT6AGCCCA 
CTCTTGCAAC 
AGTCCCCTGA 
CCACCTCTCC 
6GATCAGTAT 
CTGTTTAGTC 
6AAGAAATGG 
CCAA TOGCTA 
AACA08TCTA 
CCATTGCAGC 
TCCGCCCCTC 
ACGTTCAAAT 
TACAAGTGCA 
AA6A06CACA 
6CCAGCTCCC 
TC O GTQ G TGG 
GAGGAGGAAG 
CTGACGGAGA 
CA06AGAACA 
6A0GTCATGC 
GTCCTGGQCQ 
TGCQAC3GAAG 
CGCGGCIGCT 
A0CC0CA6CT 
CCCCOQGCCA 
GCCrcCAQGC 
6CCTCCTCX3T 
GA6CTG6A0G 
ATTAGTGGTC 
GAGTACTGTG 
ACGGGOGAAA 
CTCACCAGGC 
TGTAAGATGC 
GATCGAGTGT 
CTOCCACCTG 
CCTGTAGGAT 
A06AAGCTAA 
TTCTTTTTTC 



11 

I 

TTTTTTOCTT 
TTOTATTATT 
AGTCTCCTTC 
CGCCX5CCGCC 
AACCCCAGCA 
ATGAT6AAGC 
GTGGGCAGTG 
GGAAACAATG 
CACCAATCGA 
AAGATGAOGA 
CAGATAAACT 
TCCCCAOQCC 
GCAGCTACAC 
AOGCACAGAA 
CCCCGCGGGT 
ATOGGATTCA 
OGAGAGAQGC 
CACCACCGAG 
CCCTGGCCAC 
TGGAGCCTCC 
6CCCAC0QCT 
CAG6TAGCAA 
CTCCCTCCCA 
TTCAGAGCAA 
ACCTGTGCGA 
TGCACAAATC 
06GAACC0G6 
CCAAGTTCAA 
AGGAGGAOSA 
GGGAGAGGOr 
0CTG60QGG0 
AGG6CATGGT 
AGAAGCATAA 
ACTCGGTGGC 
CCCC66G06A 
OGCTGAGCCC 
OGATGCCCAA 
AGCTCAAAGA 
C3GGAGCACTC 
GAGGGATCTC 
OGGGCACGGG 
GGAAA6TCTT 
GGCCTTATAA 
ACATGAAAAC 
CTTTTAGGGT 
TX2AATAATGA 
ACACCCOCTT 
TTTTTTCTAG 
GAATATGAGA 
TTTTTCCTTT 



CCTGGTGGTG 
CCACG0GT6C 
GTCOOOCATG 
CACCAGOSAC 
GA006AQAAC 
CGA6GAAGAG 
GGACTAOSGC 
OGOGGTOGTG 
6CTCAGCTCC 
GCGOGGCCAC 
CGGOGAGTCG 
GTOGGCCTCG 
CTTCTCTAAG 
CACGGAGAAC 
TCCCTTCCTT 
CTCGGAGAAC 
QGGGOSCAGC 
CAGGCCCAGC 
CAAGAACTGT 
ATGCGAGCTG 
GCAT6GCCAG 
GTACAGTACC 
TATAAAAACT 
TTTCACCACT 
TCCX31TGTGA 
GTGCTTGTCA 
TTTTTTTTTT 



31 
I 

CAT6A0GGCT 
TTTG6ATGTC 

GGCTCTCCOG 

CGGGAATTCT 
COGTTGGSAO 
TTCCCATTGG 
CTCTGCTTAG 
GCATCCAATC 
ACGTCATCTA 
AGG6QCCTCT 
GCA6AATATG 
TGCAAACAGC 
TTAAGAATCT 
TCA6GACTA0 
AATAACCCCT 
GCAGAA6GGC 
GACCCCCACC 
AGTGCCTTTG 
TTCTCTAGGA 
08GCCCAGGC 
CTGGOGAOGC 
AAGTCCAAGT 
CACCGGOGCA 
ACCCAGGCCA 
AC6GTCAAGT 
TTG6TQG0CA 
GACCCCAACC 
GAAGAAGAGG 
TTGGGGCTGA 
60OGTGGGC3Q 
AT6CAGCACT 
CTGGOOGAGG 
GACOSCATAG 
GG6GGCCTGT 
GGCATCAAGC 
GTGTACTGGC 
A6CTT0GGA0 
GGGAGCTTGC 
G6CA06GGAA 
TCAAAAGM3G 
AGCAATCTCA 
TQCAACTATO 
GTGG6QAA0G 
CTG6AGAAAC 
GAATAGAGGT 
CCCTTTCCOC 
TTTAAACAAA 
CCAGCACACC 
TCCTTTATGT 



41 
I 

CTCCCACAAT 
AAAAO GCACT 

AT6TGAACCX3 
AGCCCACC3VT 
0GCCX3GAGCC 
CTCCAGAAGO 
GGGACATTCT 
AAAAAGCTGT 
COGTGGAGGT 
GAAQAATTTG 
CCTCCCCTOG 
0CC06CAGGG 
CATTCACCAG 
ACTTAGAAAG 
GTGCAGAATG 
TTAAGCT6CT 
GCTTTCCACC 
6CATAGAGCG 
ACAG6GTGCT 
GACTTAGAGA 
CTATGCAAAO 
CCCCCCTCCC 
CATGCGAGTT 
GCCACACGGG 
GCAAGCTGAA 
OOGAOQAGGO 
606CCA6C!AG 
TGATCCOGGA 
AG6AAGAGGA 
GCCTGGAGGC 
AGQAOAGOOS 
TCA60GAGGC 
CXZGAGGGCCA 
ACGATGGCAC 
CCAAAAAGCT 
TaSAOAAQGA 
AGTGQCT08C 
ACTCCAGACA 
GCTTCTCCAC 
GT0GAG6GAG 
GCAGA CSCaS 
CTGTCCACAO 
CCTGTGCCCA 
AGGTTTACAA 
ACATGAAAAA 
ATATTAATAC 
ATCGCCCTCC 
CAAACAAACA 
TGTTTTTTTT 
TCTCACOGTT 



51 
I 

TCATCTTCOC 
GATQAAOATA 
AGC0GTCX3TC 
GTCT06C0GC 
TCTTGAAGCC 
G6ATCATGAC 
TATTTTTATC 
GGATAAGCCA 
TGGCATCCAG 
CCCCAAACAG 
TTCTGCACAT 
TATTTGTAAA 
TGCATGGTTT 
CGAACACGGA 
TCCTTCCCAC 
AAGAATACCA 
CACTCCCCCC 
CCTGGGGGOG 
GOGGTTGAAT 
GCTG6CAGGG 
GTTACTGCAA 
TCCTCTGCAA 
CTGCX3GCAAG 
CGAGAAGCCC 
GCX3CCACATG 
TCTCTCCACC 
0606CTCAA6 
GAACGGGGAC 
GGAGGAGGAG 
GGOQOGCCAC 
0GC0CT6CCC 
CTTCCACCAG 
CAGGGACACT 
TGTTAATGGC 
GCTGCTGGGC 
GTTOQAOCTO 
OOOCTAOGCG 
ATOQCCTTTT 
ACOQCCOGGG 
CA0300CCAT 
OGACACTTGT 
GAGAAGCCAC 
GAGTAGCAAG 
ATGTGAAATT 
ATGGCACAGT 
OCCTCOCTCA 
AGCCCCACTC 
AACAOAAGTA 

TGAAIGCATG 



180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880
2940 
3000 



ATCTGTATGG GGCAATACTA TTGCATTTTA OGCAAACTTT GAGCCTTTCT CTTGTGCSUIT 3060

AATTTACATG TTGTGTATGT TWATITA A ACTTAGACAG CATGTATG6T ATGTTATG6C 3120 

TATTTTAAAT TGTCCCTAAT TOGTTGCTGA GCAAACATGT TCCTGTTTCC AOTTCOGTTC 3180 

TGAGAC5AAAA AGAGAGAGAQ AGAGAAAAAG ACCATGCTGC ATACATTCTG TAATACATAT 3240 

CATGTACAGT TTTATTTTAT AAOSTCAGGA GGAAAAACAG TCTTTGGATT AACCCTCTAT 3300 

AGACAGAATA GATAGCACTQ AAAAAAAATC TCTATGAGCT AAATGTCTGT CTCTAAAGG6 3360 

TTAAATXn-AT OVATTGGAAA GGAA6AAAAA A06CCTTGAA TTQACAAATr AACAGAAAAA 3420 

CA(3AACAAGT TTATTCTATC ATTTGGTTTT AAAATATGAG TGCCTTGGAT CTATTAAAAC 3480 

CACATOGATG GTTCTTTCTA CTTGTTATAA ACTTGfTAGCT TAATTCAGCA TTGGGTC3AGG 3540 

TAATAAACCT TAGGAACTAG CATATAATTC TATATTGTAT T7CTCACAAC AATGGCTACC 3600 

TAAAAAGATQ AOOCATTATG TCCTAGTTAA TCATCATTTT TGCTTTAGTT 1AATTTTATA 3660 

AACAAAACTG ATTATACCAG TATAAAAGCT ACTTTGCTCC T6GTGAGAGC TTAAAA6AAA 3720 

TGGGCTGTTT TGCCCAAAGT TTTATTTTTT TTAAACAATG ATTAAATTGA ATGTGTAATO 3780 

roCAAAAGCC CTGQAACS3CA ATTAAATACA CTASTAAGGA GTTCATTTTA TGAAGAIATT 3840 

TGCTTTAATA ATGTCTTTTT AAAAATACTG GCACCAAAAG AAATAGATCC AGATCXACTT 3900 

GOTTGTCAAG TGGACAATCA AATQATAAAC TTTAAGACCT TGTATACCAT ATTGAAAGGA 3960 

AGAGGCTGAC AATAAGGTTT GACAGAGGGQ AACA6AAGAA AATAATATGA TTTATTAGCA 4020 

CAA0GTG6TA CTATTTGCCA TTTAAAACTA GAAiCAGGTAT ATAAGCTAAT ATTGATACAA 4080 

TGATGATTAA CTATGAATTC TTAAGACTT6 CATTTAAATG TGACATTCTT AAAAAAAGAA 4140 

GAGAAAGAAT TTTAAGACTA GCAGTATATA TGTCTOXGCT CCCTAAAAGT TGTACTTCAT 4200 

TTCTTTTCCA TACACTGTGT GCTATTTGTO TTAACATG6A AGAGGATTCA TTGTTTTTAT 4260 

TTTTATTTTT TTAATTTTTT CTTTTTTATT AA6CTAGCAT CTGCCCCAGT TGGTGTTCAA 4320 

ATAGCACTTC ACTCTGCCPG TGATATCTGT ATCTTTTCTC TAATCAGAtSA TACAGAGGTT 4380 

GAGTATAAAA TAAACCTGCT CAGATAGGAC AATTAAGT6C ACTGTACAAT TTTCCCAGTT 4440 

TACA6GTCTA TACTTAAGGG AAAASTTGCA AGAATGCTGA AAAAAAATTG AACACAATCT 4500 

CATTGAGGAG CATTTTTTAA AAACTAAAAA AAAAAAAACT TTGCCAGCCA TTTACTTGAC 4560 

7ATT6AGCTT ACTTACTTGG ACGCAACATT GCAA60GCIG TGAATGGAAA CAGAATAC3VC 4620 

TTAAOkTAGA AATGAATGAT T6CTTT06CT TCTACAGTGC AAGGATTTTT TT6TACAAAA 4680 

CTTTTTTAAA TATAAATGTT AA6AAAAATT ITTTTTAAAA AACACTTCAT TATGTTTAG8 4740 

GG6GAACTGC ATTTTAGGGT TCCATTGTCT TGGTGGTGTT ACAAGACTTG TTATCCATTT 4800 

AAAAATGGTA GTGGAAATTC TATGCXrTTGG ATACACACOG CTCTTCAGGT TOTAAAAAAA 4860 

AAAAACATAC ATT6GGGAAA GG7TTAAGAT TATATAGTAC TTAAATATAG GAAAAT6CAC 4920 

ACTCATGTTG ATTCCXATGC TAAAATACAT TTATGGTCTT TTTTCTGIAT TTCTAGAAT6 4980 

OTATTTSAAT 7AAATQTTCA TCTAGTGTTA GGCACTATAO TATTTATATT GAA6CTTGTA 5040 

TTTTTAACTG TTGCTTGTTC TCTTAAAAGG TATCAATGTA CCTTTTTTGG TAGTGGAAAA 5100 

AAAAAAGACA GGCTGCCACA GTATATTTTT TTAATTTGGC AGGATAATAT AGTGCAAATT 5160 

ATTTGTAT6C TTCAAAAAAA AAAAAAA6AG A6AAACAAAA AA6TGTGACA TTACAGAXGA 5220 

GAAGCCATAT AATGGGGGTT TGGGG6AGCC TGCTAGAATG TCACATGGAT OGCTGTCAtA 5280

GGGQTTgT A C ATATOCTTTT ■ nWAXX. ' m TTCCTGCT8C CATACTGTAT GCAGTACTGC 5340 

AAGCTAATAA CGTTGGTTTG TTATGTAGTG TGCTTTTTGT CCCTTTCCTT CTATCACCCT 5400 

ACATTCCAGC ATCTTACCTT CATATGCAGT AAAAGAAAGA AAGAAAAAAA AAQGAAAAAA 5460 

AAAAAAAAAC CAATGTTTTG CA6TTTTTTT C31TTGCX:AAA AACTAAATGG TGCTTTATAT 5520 

rrAGATTGGA AAGAATTTCA TATOCAAAGC ATATTAAAGA GAAA6C00GC TTTA6TCAAT 5580 

ACTTTTTTGT AAATGGCAAT 6CAGAATATT T 'i'G'iT A 'iTG G CCTTTTCTAT TCCT6TAAT0 5640 

AAAGCTGTTT GTC6TAACTT GAAATTTTAT CTTTTACTAT GGGAGTCACT ATTTATTATT 5700 

GCTTATGTGC CCTGTTCAAA ACAGAGGCAC TTAATTTGAT CTrTTATTTT TCTTTGTTTT 5760 

TATTTTTTTT TTTATTTAGA TGAOCAAAGG TCATTACAAC CTGGCTTTTT ATT6TATTTG 5820 

rn v i wr cr TTGTIAAGTT CTATTGOAAA AACCACTGTC TSTSTTTTTT TOGCAOTTGT 5880

CT6CATTAAC CTGTTCATAC ACCCATTTTG TCCCTTTATT GAAAAAATAA AAAAAATTAA 5940 
A 

Seq ID NO: 85 Protein sequence:
Protein Accession #: NP_075044.1

1 11 21 31 41 51

| | | | | |

MSRRKQ6KPQ HIiSKREFSPB PLEAZLTDDE PDH6PLGAFE GDHDLIiTOSQ OQHHFPLGDI 60 

LIFXERiaUCQ QYGSLCLBKA VDKFPSPSPZ EMKKASHFVB VOIQfVTPBDD DOiSTSSRRX 120 

CPKQBKZADK LLBHRGLSSP RSAKGALZPT PGHSABYAPQ GICKDEPSSY TCTTCKQPPT 180 

SAWFLLQHAO NTHGLRIYLE SEHGSPIiTPR VGIPSQLGAE CPSQPPLHGI RIASKIIPFNL 240 

LRIPGSVSRE ASGLABGRPP PTFPLFSPPP RHHLDPHRIE RLGAEEMALA THHPSAFDRV 300 

LHLNPMAMEP PAKDPSRRLR ELAGNTSSPP LSPGRPSPNQ RUiQPFQPGS KPPFXATPPL 360 

PPIiQSAPPPS QPFVXSKSCB FOGKTFKFQS HLWHRKSBT GSKPYKOn^C DHACTQASKL 420 

KRRMRIEMRK SSPMTVKSDD GLSTASSPEP GTSDLVGSAS SALKSWAKF XSENDPEJLIP 480 

EKGDEEEEB7 DESEEEEBEE EEESLTESER VDYGPGLSLE AAREHEKSSR GAWGVGDES 540 

RALPDVKQGM VLSSMQHFSE AFHQVLGEKH KRGHLAEAEG HRDTCDEDSV AGESDRIDDG 600 

TVNOtGCSPG ESASGGLSXK LLLGSPSSLS PPSKRZKLEK EFDLPPATMP NTEtlVySQWL 660 

AGYAASHQUC DPPLSFGDSR QSPFA£S8EH 8SBNGSIAFS TPPGELDGGZ SG R SGT G SGG 720 

STPBZSGPGT 6RPS8KB6RR SDTCEYCCaCV ihCNCSNLTVB RHSBIGBRPY IKaSI^nACA 780 
QS8KLTRHMK TBGQVGKDVY KCBICRMPFS VYSTLEKHMK lOmSDRVLNH DZKTB 

Seq ID NO: 86 DNA sequence

Nucleic Acid Accession #: XM_035292.2

Coding sequence: 53-1576

1 11 21 31 41 51

| | | | | |

GCTCGCTGGG CCGCGGCTCC CGGGTGTCCC AG6CC0QGCC GOTGOGCAQA GCATGQC6GG 60 

TGOGG6CCGS AAG06G06OG CGCTA6CGGC GC0GGO6GCC GAGGAGAAGQ AAGA6GCX3CO 120 

GGAGAAGAT6 CTGGCOSOCA A6AG0GCGGA CGGCTC3GG0G CCGGCAGGOG AGG60GAGGG 180 

CGTGAOCCTQ CAGCGGAACA TCAOGCTGCT CAAOGGOGTG GCCATCATOO TGGGGACCAT 240 

TAT03GCT0Q GGCATCTTOO TGAOGCCCAC GGGOGTGCTC AAGGAGGCAO GCTOGCCGGG 300 

GCTGGOGCTO GT G GTGTGGG CCGOGTGOGG CGTCTTCTCC ATCGTGGGCQ OGCTCTGCTA 360 

C360GGAGCTC GGCAOCAOCA TCTOCAAATC GGGOGGOGAC TA06CCTACA TGCIGGAG&T 420 

CTAOGGCTOg CT6C0060CT TOCTCAAGCT CTGGAT0QA6 CTGCTCATCA TGCOGCCTTC 4B0 

ATOGCAGTAC ATOGTGGOCC TGGTCTT06C CA0CTACCT6 CTCAAGOOGC TCTTCCOCAC 540 



CTGC00SGT6 
GGC06TGAA€ 
CAAGCTCCTG 
TGTGTCCAAT 
TGTGCTGGCA 
OiCAOAOGAA 
CATC6T6IV0G 
GCAGATGCTG 
GTCCTGGATC 
GTTCACIVTOC 
CTCCftTGATC 
GAC6CTGCTC 
CAACTGGCTC 
TGAGCTTGAG 
CCTCTTCCTG 
CATCATCCTC 
GTGGCTCCTC 
CCCCCAGGAG 



CCGGAGGAOG 
TGCTACAGOG 
GCCCTGGCCC 
CTAGATCCCA 
TTATACAGOG 
ATCATCAACC 
CTGGTGTACG 
TOGTCCXSAGG 
ATCCXXX3TCT 
TCCAGGCTCT 
CACCCACAGC 
TACGCCTTCT 
TGCGTGGCCC 
OGGCCCATCA 
ATCGCCGrrCT 
AGOSGGCTGC 
CAGGGCATCT 
ACATAGCCA6 



CAGCCAA6CT 
tXSAAGGCOGC 
TGATCATCCT 
ACTTCTCATT 
GCXTCm-GC 
CCTACAGAAA 
TGCTGACCAA 
CCGTGGCCGT 
TOGTGGGCCT 
TCITO3TG66 
TOCTCACOOC 
CCAAGGACAT 
TGGCCATCAT 
AGGTGAACXrr 
CCTTCTG6AA 
COGTCTACTT 
TCTCCACSAC 
GAG6C06AGT 



CGTGGCCTGC 
CACCOGGGTC 
GCTGGGCTTC 
TGAAGGCACC 
CTAT GGAOGA 
OCTGCOCCTCS 
CCTOGCCTAC 
6GACTT0QGG 
GTCCTGCTTC 
6TCC066GAA 
CGTGCGGTCC 
CTTCTCC6TC 
CGGCATGATC 
GGCCCTGCCT 
GACACC06TG 
CTTOGGGGTC 
OGTCCTGTGT 
GGCTGCCGGA 



Seq ID NO: 87 Protein sequence:
Protein Accession #: XP_035292.2



MA6AGPKRRA 
GTIIGSGIFV 
LBVYGSLPAF 
LLTAVNOrSV 
CanVLALYSQ 
STEQMLSSEA 
SILSMIHPQL 
RKPRTiRRPIK 
KPKHXiLQGIF 



11 
I 

LAAPAAECKE 
TPTGVLKEAG 
LKLWIEUill 
KAATRVQDAF 
LFAYGGHNYL 
VAVDPQryHL 
LTPVPSLVPT 
VKLALPVFFZ 
STTVLGQKLM 



21 

1 

EAREKMLAAK 
SPGLALWWA 
RPSSQYIVAL 
AAAKLLAIiAL 
NFVTE0«NP 
GVMSWIIPVP 
CVMTLLYAFS 
LACLPLXAVS 
QWPQBT 



31 
I 

SADGSAPAGE 
ACGVFSIVGA 
VFATYLLKPL 
IZLLGFVQZG 
YRHLPLAIZI 
VGLSCFGSVN 
KDIPSVINPP 
FHKTPVBOGZ 



CTCT6GGTGC 
CAGGAT6CCT 
GTCCAGATOG 
AAACTGGATC 
TGtSUlTTACT 
GCCATCATCA 
TTCAGCACCC 
AACTATCACC 
GGCrcOGTCA 
GGCCACCTGC 
CTOnXSTTCA 
ATCAACTTCT 
TGGCTGCGCC 
GTGTTCTTCA 
GAGTGTGGCA 
T6GTG6AAAA 
CAGAAGCTCA 
G6AGCATGC 



41 
I 

6BGVTW3RNI 
IiCYAELGTTI 
FPTCFVPEBA 
XGDVSNLDPN 
SIiPIVTLVyV 
GSLFT5SRLF 
SFEtlWLCVAL 
GFTZILSGliP 



TGCTGCTCAC 
TT6006C08C 
GGAAG6GTGA 
TGGGGAACAT 
TGAATTTCGT 
TCTOQCTGCC 
TGTOCACOGA 
TGGGOGTCAT 
ATGGGTCCCT 
CCTCCATCCT 
06TGTSTGAT 
TCAGCTTCTT 
ACAGAAA6CC 
TCXrrGGCCTG 
TGGGCTTCAC 
ACAAGOCCAA 
TGCAGGT06T 



51 
I 

TLUIGVAZIV 
SKSG<a)YAYM 
AKLVACLCVL 
FSFEGTKLSV 
LTNLAYFTTL 
FVGSREGHLP 
AIIGMIWLRH 
VYFFGVWKKN 



Seq ID NO: 88 DNA sequence

Nucleic Acid Accession #: NM_005268.1

Coding sequence: 168-989 



1 

1 

TAAAAAGCAA 
TCTGGATATG 
AGCCCTGAGG 
TCTTTGAGGQ 
TGTCTCTOGT 
6T6ATGACCA 
TTGATGAGTT 
CAT6CCCCTC 
ACXX3AGAA6C 
GTGGGCTCTG 
TTCTCTATGT 
AOQC AGATCC 
TTTTCACCCT 
TCATCTACCT 
TGT6CACAGG 
C66GTGACCT 
GAGAOCAIGT 
CCT6GATGG6 
CATGAGGTAG 
TCAACTCCAG 
GCTCGGTTTC 



11 

1 

AAGAATTCGC 
AAATTCAAGC 
AQTAGTCACT 
ACTCCTGAGT 
CTTCATCTTC 
CAAGGACTTC 
CTTCOCZtSTG 
ACTGCTC6TG 
CCATOGGGAG 
GTGGACATAT 
GTTCCACTCA 
ATGTOCXIAAT 
CTTCATOGTG 
GGTGAQCAAG 
TCATCACCCC 
CATCTTTCTG 
GAAGAAAAOC 
GAGGCTCTAG 
GGGCAGGCAA 
CCACCTGCCC 
CTTTTCTAGA 



21 
I 

GGCCGOGTCG 
TGCTTGCTGA 
CAQTAGCAGC 
GOGGTCAACA 
OGCGTGCTGG 
GACTGCAATA 
TCCCATGTGC 
GTGAT6CA08 
AACAGTGG6C 
GTCTGOWSOC 
TTCTACCCCA 
ATAGTGQACT 
GCCAOUGCTG 
AGATGCC3^CX3 
CAC6GTACCA 
G6CTCAGACA 
ATCTTGT6AS 
CATCTCTCAT 
GAGAGAGGAT 
CA6CTOGA06 
ATGGAAATA6 



31 
I 

ACACGGGCTT 
OTCC7ATTGC 
TGA06CGTGG 
AGTACTCCAC 
TGTACCTGGT 
CTCGCCAGCC 
GGCTCTGGGC 
TG6CCTAC0Q 
GCCTCTACCT 
TAGTGTTCAA 
AATATATCCT 
GCTTCATCTC 
OCATCTGCAT 
AGTGCCT06C 
CCTCTTCCTG 
6TCATCCTCC 
GGGCTGCCTB 
A66TGCAACC 
TCAGAOGCTC 
GCACTGGGCC 
TGAGGGCCAA 



41 
I 

CCCCGAAAAC 
OOGCTGCTGG 
GTCCACCATG 
AGCCTTTGGG 
GACGGCCGAG 
CGGCTGCTCC 
CCTGCA6CTT 
OGAGGTTCAG 
GAACCCOGGC 
GGCGAGCGTG 
CCCTCCTGTG 
CAAGCCCrCA 
CCTGCTCAAC 
AGCAAGGAAA 
CAAACAAGAC 
TCTCTTACCA 
GACTGGTCTS 
TGAGAGT6G0 
TGGGA6CCA6 
AGTTCCCCCT 
TGC 



51 

1 

CTTCCCCGCT 
GAGCCAGGAG 
AACTOGAGTA 
OGCATCTGGC 
06TGTGTG6A 
AACGTCTOCT 
ATCCTGGTGA 
GAGAAGAGGC 
AAGAAGCGGG 
GACATCGCCT 
GTCAAGT6CC 
GAGAA6AACA 
CTCGT6GAGC 
GCTCAAGCCA 
GACCTCCTTT 
GACCGCCCCC 
0CA6GTTG66 
GGA6CIAAGC 
TTCCTAGTOC 
CT6CTCTGCA, 



Seq ID NO: 89 Protein sequence:
Protein Accession #: NP_005259.1

1 11 21 31 41 51

| | | | | |

MNHSIFEGLL SGVKKYSTAF GRIWIiSLVFZ FRVLVYLVTA ERVWSDDHKD FDCNTRQPGC 
SNVCFDEFPP VSHVRLWALQ LILVTCPSLL WMHVAYREV QEKREREAHG EKSGRLYU7P 
GiOCRGGIiWMT YVCSLVFKAS VDIAFLYVPH SFYPICYIUP WKCEADPCP NIVDCFISKP 
SEKNZFTLFM VATAAICILti HLVELZYLVS KRCHECLAAR KAQAMCT6BH PRGTTSSCKQ 
DDXiLSGDLZF USSDSHPPUi FDSPRDHVKK TIIi 



Seq ID NO: 90 DNA sequence

Nucleic Acid Accession #: NM_002391.1

Coding sequence: 26-457 



1 
1 

GGGGCGAAGC 
OGCCCTGCTG 
CCCGGGGABC 
0GG06T6G6T 
GC CCTGCA AC 
TGCGTGTGAT 



31 



41 



51 



600 
660 
720 
780 
840
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



60 
120 
180 
240 
300 
360 
420 
480 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 



60 
120 
180 
240 



11 21 

I I i 1 I 

AG060GGGCA GOGAGATGCA GCACOGAGGC TTCCTCCTCC TCACCCTCCT 
GCGCTCACCT CCGCGGTC6C CAAAAAGAAA GATAAGGTGA AGAAGGGOSG 
GAGTGCGCTG AGTGGGCCTG GGG6CCCTGC ACOOGCAGCA GCAAG6ATTG 
TTCCGOGAGG GCACCTGC6G GGCCCAGACC CA6GQCATCC G6TGOGGGT 
TG6AAGAAGG AOTTrGGAGC CGACTGCAAG TACAASTTTG AGAACTGGGG 
GGGGQCACAO GCAiCCAAAST COGCCAAGGC ACCCTGAAGA AG6C6C6CTA 



60 
120 
180
240 
300 
360 






TGCACCGCCA AGAOCAAAGC 
CCAAGCCTG6 ATQOCAAGGA 
CCCAGGCC06 AGATGTGACC 
TGCCCTGCCT T G TCCCrCTC 
ACAAGGGATT CTGGGAAGCT 
TrenCTTOC CCACAATTOC 
CAATAAAAGC TCTTCTTTTT 




OUITGCTCAG TGOCAGGAtSA 0C3VTC000ST OOCAAGCCC 
AAASGCCAAA GCCAAGAAAG 66AAGGGAAA GGACTA6A0Q 
GCCOCTQG TQ TCACATGGGG CCTCGCCAOO COCTCCCTCT 
CACCAGTOCC TTCTGTCTGC TOGTTAGCTT TAATCAATCA 
ACTOO0CA6C CCCACCCCTA AGTGCCCAAA GTGGGGAGGG 
TCAGOCrOCC CCAAAGCAAT 6TGAGTC0CA GA6C0G6CTT 
ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC 
TAATAT 

Seq ID NO: 91 Protein sequence:
Protein Accession #: NP_002382.1

1 11 21 31 41 51

| | | | | |

MQHR6FLLLT LLALIiALTSA VAKKXDKVKK GGPGS£CAEW AKGPCTPSSK DCGVGFRE6T 
GGAQTQRXRC HVPOmKKEP GADOCyRFEN NGACDGGTOT KVRQGnilQCA RyNAQCQSTZ 
RVTKPCTPKT KAKAKAKKGK GKD 






Seq ID NO: 92 DNA sequence

Nucleic Acid Accession #: NM_005130.1

Coding sequence: 98-802



1 
I 

CTCTACCTGA 
OGTGTGCTCA 
GCTCTCCTTC 
GAATGGACTT 
TAAGCAGAAA 
CAGAT6GGCT 
GGACCAT6AA 
TGAGAGA8TC 
ATATTOCAAG 
TAAGCTA6TC 
GTCCCCCAGG 
GACCATGGCC 
GACTGC CCTO 
AOTGCAOGAC 
TGT06TAAGT 
TGTGCTTAGT 
TGGAATTTGC 
TTGCATGGCC 
GAGTOATAAT 
TTTTTCAAAA 



11 

1 

CACAGCTGCA 
GAACAAGGTG 
CTCCTACTGG 
CACAGCAAAG 
AGCAGGCCCG 
GCTACTGAGC 
TTTTOCTGTO 
7ATTGSAAAC 
ACTkGCTCTGA 
AGCTCCACTC 
GAGCACATCA 
ACXAAAGCTC 
QAGTTCTGTG 
ACGTCATGCT 
CCCTCT6TAT 
GAGTGCAACG 
CTTATTTTTC 
CACACAt^TA 
TTCAGT6CAA 
AAAAAAAAAA 



21 

1 

GCCTGCAATT 
AAC36CCCAGC 
CTGCTCAGGT 
TGGTCTCAGA 
GGAACAAAGG 
AGGAGGA6GG 
TCTTTGCTGG 
AAGTTGOC06 
AAACCAGAGT 
TATTTGGGAA 
AGGGCAAAGA 
COGAGTGTGT 
GaGAOACTTO 
AATGA6GTCA 
ACTTTAAA6C 
AAATATTTAA 
TTGGATGOQA 
T6TGTTTGA0 
OGAACTTTCT 
AAA 



31 
I 

CACTCrCACT 
TGCAGCCATG 
GCTCCTGGTG 
ACAAAAGGAC 
CAAGTTTGTC 
CATCTCTCTC 
CAATCCAACC 
GAATCTGCXSC 
6TCCAGAAA6 
CACAAAOCCC 
GACCACCCCC 
GGAGGACCCA 
GAOCTCTCTC 
AAA6A6AA0G 
TCTCTACAGT 
ACAAGTTTTG 
TGTTCAGAG6 
CAG06AAGAG 
OCTGAATTAA 



41 

I 

GOCTGGGATT 
AAGATCTGTA 
GA^i»GAAAA 
ACTCTGGGCA 
ACCAAAGACC 
AAGGTTGAGT 
TCATGCCTAA 
TCACAGAAAG 
GATTTTCCAG 
AGGAAGGAGA 
TCTAGCCTAG 
GATATGGCAA 
TGCACATTCT 
GGTTGCTTTA 
CCCCCCAAAA 
TATTTTTTGC 
CTGTTTCCTG 
TCTTTGAOCT 
TOGTAATAAA 



Seq ID NO: 93 Protein sequence: 
Protein Accession #: NP_005121.1 

1 11 21 31 41 51

| | | | | |

MKICSLTIiLS FLUiAAQVLL VEX^KKKVKNG LKSKWSEQK DTIiGNTQIKQ KSRPGNXGK? 
VTKDQANCRN AATGQEB6ZS LKVECTQLDH EFSCVFAGMP TS CLKLKD ER VYWCQ VAHNL 
RSQKDICRYS KTAVXTRVCR KDFPESSLKL VSSTLFGMTK PRKBKTEHSP RERIXGKETT 
PSSLAVTQTM ATKAFECVED PDMANQRKTA LEPC6ETWSS LCTFFLSIVQ DTSC 

Seq ID NO: 94 DNA sequence
Nucleic Acid Accession #: NM_012101
Coding sequence: 125-1891 



1 

I 

CTCCTCACAG 
TGCCAGAAAG 
TGGGATGGAA 
COOGAGCCCQ 
TGCCAAGACC 
CCIGAA60CA 
CATCCAGTTT 
OSAAGGCAAG 
TACCTTTGCC 
G6TX3TCCATC 
CCTTTTTTCA 
CAAQCAGAA6 
CAAGCCCCAC 
CTTTGAGGCC 
CGAQATCTGC 
AGIGGAOGAG 
6CTCAA6ATC 
CAAGAGCTTC 
GGACCTGGAG 
TGTGGACCAA 
GGACAAGCAO 
ATTTGGTGCA 
GCTGGAGGGG 
ATGCATGCGC 
GAACCACATG 



11 
I 

GTGTGTCTCT 
GTCACCTATC 
GCTOCAGATQ 
TOGGGCCCCA 
ACCAAGGGGC 
GGGGAAGGTA 
GTC6A6TCC6 
AGGTCGCCGT 
GAAAAGG600 
ATGGA6CCCG 
OQGTOCAAGT 
GOG6TCAA6T 
CT6GAGGGCG 
CGCAAGTGTC 
ATCT6CTACC 
GOCAAGGCOG 
ATT6AGATTG 
ACCACCAATG 
AA6CAAAAGG 
GTGAAGGTGA 
ACXrOGGGAGC 
TT6ATGAGCA 
GAGGGCCTGG 
CAOGTTGAGA 
GAGAA0G6TG 



21 

1 

AGTCCTCGTG 
CT6AACCCCA 
CCTCCAGGAG 
GTGGCAGCCT 
ACG6CGGGGA 
GGAGOGCOCT 
GGGAOGACAA 
ACGCAGG6CT 
ACGTGOGCAA 
GGGA GAOCO G 
C06GCT00GA 



COGCCTTCCG 
CCX3TGCATGQ 
TTTGCATGTT 
AGAAGGAGAC 
AGGAT6AAGC 
AGAAGGCCAT 
A6GAAGTGAQ 
TCATGGATGC 
AGCTGCATAG 
ATTACTCTCT 
GACAGTCACT 
AGAT6TGCAA 
GT6AOCAT06 



31 
I 

GTT60CT6CC 
GCAAGCCTGA 
CAAC36GGTCG 
GGAGAATGGC 
GGCA6CTGA6 
6TTCG0G6GC 
6AACTCCAAC 
CCAGCTGGG6 
GTCCATTTTC 
G0G6AACAGC 
GGA08T6CIQ 
GTGCCAGGCC 
AGACCACCAG 
CAAGA06ATG 
CCAGGAGCAC 
GGAGCTGTCA 
TGAGAAGTG6 
CCTGGAGCAG 
GGCTGC6CTG 
TCTGGATGAG 
CATCAGOGAC 
CCCCCCACCC 
AGGCAACTTC 
GGCGGACCTG 
CTAIGT6AAC 



41 
I 

OGACTGCCTQ 
AACAGCTCAG 
AGCCCAGAAG 
ACCAAGGCTG 
GGCAAGAOCC 
AAT6AGT66C 
TACTTCAGCA 
GCTGCCAAGA 
TCGGAGTCCC 
TACCCCCGG6 
1QCGACTCCT 
TCCTTCTGOG 
CTGCTCGAGC 
GAOCTCTTCT 
AAGAATCATA 
CTGCAAAAGO 
GVGAAGGAGA 
AACTTCCGGG 
QAGCAGCGGG 
AGAGCCAAGG 
TCTGTGTTGT 
CTGCCCACCT 
AA6GACGACC 
AGCCGTAACT 
AACTACAOGA 



51 
1 

G06AGA0GCC 
CCAAGCACCC 
CCAGGGATGC 
ACG6CAA0GA 
TGGGCAGOGC 
GOOGAOCCAT 
TGGACTCTAT 
AGCCACCCGT 
GGAAGCCCAC 
CCGACA0G6G 
QCATOGGCAA 
AOCTGCATCT 
CCATCCGGGA 
GCCAGACCGA 
GGAOOGTGAC 
AGCAGCIGCA 
AG6AC06CAT 
ACCTGGTGCG 
AGCAGGATGC 
1X3CTGCATGA 
TTCTGCAGQA 
ATCATGTCCT 
TOCTCAATGT 
TCATTGAGAG 
ACAGCTTC6G 



420 
480 

540
600 
660 
720 
780 



60 
120 



51 
I 

6CACIGQATC 
GOCTCACCCT 
AAAAAGTGAA 
ACACCCAGAT 
AA6CCAACTG 
GCACTCAATT 
AGCTCAAGGA 
ACATCTGTAG 
AATCCAGTCT 
AAACAGAGAT 
CAGT6ACCCA 
ACCAGAGGAA 
TOCTCAGGAT 
AGAQAT GTCA 
TAT6AACTTT 
TTTTGTGTTT 
CAGCATGTAT 
GAATGA6CCA 
ACTCTGGQTQ 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 



60 
120 
180 



60 
120 
180
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840
900 
960 
1020 

1080

1140 
1200 
1260 
1320 
1380
1440 
1500






WO 02/086443 

AGICCACGGG 
ACATCATACC 
AACAATCTCr 
ATTCAGAACT 
GGCTATCCCr 
GGCAAGCAGA 
GOGTCCAAOQ 
TCXrrCCTGAC 
CCTGGTCCTG 
CTCCGACTTC 
TGGTCACCAT 
CCTGCCCTAA 
COGCATOGTA 
TTCTCTCCAA 
ATCTOCCATT 
TOGTOCTACC 
AAACXr rCTC A 
TGAGrCTGTO 
ATTCCTCTCT 
CAAATAGCTA 
ACTTTGGCAT 
CACT6CCCCC 
CCATGTT6CA 
TGGG6ACATC 
CTTGCA6GG0 
GGTCTGTC 



GGGTGAGTGG 
TGGGGTCOGG 
GAAGAATTTC 
CTCCTCCAGC 
CTCCCTGAAA 
TTGGAAATCT 
CAA0606ATT 
CCCCTGCTCT 
TGGGAGGGAG 
GTTCOGGCCT 
TGACCTCAGA 
TA66TTG0GG 
CA6TGAGTAC 
CTATAGACGT 
ACAGCCACCC 
GTGCTCTCTC 
CT6CAGATG6 
ACAGCCACn 
TAGCCAAGAT 
TCCACCCATG 
TTCAGTCTAC 
GTeCCTTACA 
TGATTACOOC 
AGCAGCACAG 
GATGCjGCCAG 
ATAAACXATT 



AC ACCATGAA 
AGCCCTOGTC 
ATGGCACCAA 
CTGACAATGA 
CCCTCATG06 
CIATGCTGTC 
AA6CCCCATG 
CCTGCTGCrC 
CACCTGCCCT 
CCCACTG6CC 
CATTOCTBXO 
CCCGCCAGCC 
TCAGCCTGCC 
GGCCCTATCC 
CACATGGCCC 
TATCAATGCC 
GTOTCTTGAC 
GTOCCTGGAfl 
TCCCTCTGCT 
CTCGCCCAGC 
TCTCTCTGGC 
ACCCrCAGCC 
TATCAGGGTG 
TOCC G TCTCA 
TTGGQG A GGG 



GAOATACTCC 
TUCrGGCOGC 
AGGTAACTAC 
CCTGCCCGTC 
6A6CCAAAGC 
TCACTAC066 
AGCTOCTGGC 
TTGCCTTCTA 
CTGCAGCCCr 
ACACTCCATT 
CIGA6A06CC 
TOCTOCTCTC 
TCTCOCGCCC 
COCAATGTTG 
AOCTCCTGCT 
CAGCAT66CA 
ATCACCCTAC 
GGTOC5CTTCT 
GAGATAAAGA 
TACCATTTAC 
GATGGAGTGT 
GTTGCCCCAT 
CTCAAGGATT 
ACAGCCCCAG 
AGACATCCAG 



ATGTACCTGA 
TTCA0CAAG6 
ACCTCCOGGG 
GTCCAAGGCA 
CCCAAGGCCC 
GCATTCTAOO 
GGAAGGAACG 
AGCTACTGTG 
CTGCCAGCCT 
CAGACTCCTT 
AACCCATCAC 
GGGCTG6ATC 
AOGCCCTGCT 
TCAGCAGATG 
TCCCAGAGGA 
GAAOC TGCAfl 
CCA660GGTG 
CCTGACTGGC 
ATTCXXTTAA 
CATTTGCCTA 
GGCTGGGCTG 
CAGAGGCTGC 
GGAGAGGAGA 
GCCTATGGGG 
CTTGG6CTTT 



CAOOCAAAGG 
AGACCAOCCA 
TCTGGGAGTA 
GCTCCrCCTT 
AGCCCCAGAC 
TCAACAAA66 
AGQOS CCACA 
CTTGTCTGGG 
CTTGGGGGCA 
TCCTGOCTTG 
AOGGGTmGA 
TOGGGGCTAG 
GTCTCCA6GC 
CCTGGACAGC 
CTGGCCCTAC 
TGGOCAAGGG 
GGTCTCCAOC 
AGGATGACCT 
CATGATATAA 
CAGAATTTCA 
A0C3QCAAAAG 
CTCCTCCTTC 
CAAAACCAGG 
GCTCTGGAAG 
CCCCTTTGGA 



Seq ID NO: 95 Protein sequence:
Protein Accession #: NP_036233.1



1 
I 

MEAADASRSN 
KPGBGRSALF 
FAEKGDfVRKS 
QKAVKSCLVC 
TCZCnOIFQ 
SFmiEKAII) 
KQTREQLHSI 
MRHVEKMCKA 
VRTSYQPSSP 
LRGYPSLMRS 



11 
I 

GSSPEARDAR 
AGNEWRRPII 
IFSESRKPTV 
QASFCEiaLK 



21 
I 

SPSGPSGSXiS 
QFVSSGDDKN 
SIMEPGETRR 
PKLEGAAFRD 



EQNPRDLVRD 
SDSVLFI/QETF 
DL8RNFIESN 
GRFTKETTQK 
QSPKAQPQTW 



LSKQKEEVRA 
GALMSNYSLP 
HMENGGDHRY 
NFtSNLYGTKG 
KS(HCQTKXiSH 



31 
I 

NGTKADGKDA 
SNYFSMDSME 
NSYPRADTGL 
HQTiTiRPIRDF 
LSLQKBQLQL 
ALBQSBQBAV 
PPLPTYHVLL 
VNNYTNSFGG 
HYTSRVWBYS 

YRpFYvmaaf 



41 

! 

KTTMUUGGEA 
GKRSPYAGLQ 
PSRSKSGSHE 
EARKCPVHGK 
KIIEZEDEAE 
DQVKVZMDAL 
EGE6L6QSU3 
EWSAPDTWKR 
SSIQNSDZmL 
GIGSNEAP 



Seq ID NO: 96 DNA sequence

Nucleic Acid Accession #: NM_080668.1

Coding sequence: 83-841



GGCACGAGGG 
GAGCTOGAGA 
GOGCTCOGGG 
A6GCTCTGAA 
A6TCAGAAAG 
CC3VATCACCT 
TGGGAGGGAG 
CAQCACTCCT 
CAGAGACTTG 
CTCTGCCTCT 
GGCA6AAGAC 
G6TTTGTGCA 
GAAACAGAAA 
GGCTGCGGOC 
AGATGCA6TG 
CCTGTG6AQA 
GCTGOGCATG 
TTAGGAAATG 
CCTTCCTATC 
GAATCAGTGG 
CTTGTTQGGA 
CCCTCCTGAG 
TTCAGTTCCG 
6AGGGCAATT 
AAGGA TGTAG 
GTTTT^TAGT 
GCTGCTTGGA 
OCTCATGGCC 
TGCAGGGGTG 
A0G6AGTCTA 
CAACTATSCT 
ATAGCAATTT 
CTCATGATCT 
TTTGGATTTG 
TnYSATGTTT 
CAG AGAA TGC 
AGOCSGGOOC 



11 

1 

CAGCGAGTGG 
CGGAGCCTAG 
CCAAGGGOCC 
CTOCOQAGCA 
COCATOSTCT 
CGCAGGAGCC 
CTTACTAAGG 
6TGC0GAACC 
QAAATGTCTA 
ACCTOCACCC 
TT6TCCX3GA6 
AAGCCCTG6G 
GGTAAGAAGA 
ATGAATGCOG 
GGGGGTGCAC 
GGACACTTAG 
AGGACTGTCT 
GGGCC6CCTG 
TCCCCAAAGT 
CCCAOCCAGA 
GGGGTGGCTG 
TTGCCTTCTG 

CTGTCTTGGA 
GGAGOCTTAG 
TCTGGTCTGC 
GCAAAGGGTG 
TCTGGGCTGO 
QAAGGT6GCC 
CTTGGCCCTA 
IGTAAAGTCC 
TAGTTTTTGG 
CTGGAGAATT 
AAG6CTG00C 
A6AAGTT0ST 
CAOOQAAGAT 
tAATAAAACC 



21 

I 

CCTTCCOGGT 
TTATGTCTGG 
CATCTCCTAC 
TCCTCCCT6A 
TAAAGAGGAT 
CTAGGATTTC 
AGGACCTTTT 
CTGA66CGGA 
AGAAAGTCAG 
CAGGCCX3CC6 
TCTOOCCAGT 
CCCCAGACAT 
AGAAAATGCX: 
AGTTTGAAGC 
CTGGCCAGAC 
6GTCCCCTCC 
GCX7TTGAGG 
GCCCAGCCAC 
ACCATAGCCA 
AGTTAAAGGG 
CTTGGAAATA 




31 

t 

TGGCGOGOGC 
GAGGGGAAC6 
TAAGOCTCTG 
AATCTGGCGO 
OGTQGCCCAT 
CTTTTTCTTG 
CAAGACACAC 
GTGCM3CTCC 
GO G TTCCTAC 
GTCCTGCTTT 
GQTGTGCTCC 
GACTCTCCCT 
AGAGATCTTG 
TGC7GAGCAG 
TCTCCCTCCT 
CCTGGTCTTG 
GCTTGGGCAG 
TCACIGGTGT 
GTTTCCA6AT 
CTGAOGGTTO 
6GCCCAGGG6 
TCTTCTTGAA 
GTAAATAGTC 
OGACATTCAG 
GACXATAAGT 
TTTG6TAAAT 
TGT06CCACC 
AOGQCCCACG 
CCCATACCCA 
TTGAAGAGTC 
TCCTCGOGTA 
CTCTCACATG
TCTCTTCTTT 

6CTGAGGTGT 
TCAGGGTACT 
TCXtSQGAGTC 



41 

1 

CCGGG60GGC 
CGGT0C3GGAG 
0QGAG6TC0C 
AA6ACACCCA 
6CTGTAGAGG 
GA6AAAGAAA 
AGQGTCCCTG 
AAGGAAGGAG 
AGC0G6CTGG 
GGCTTCGAGG 
AAACTCACCQ 
GGAATCTCCC 
AAAAOGGAGC 
TTTGATCTCC 
GTCCTGTACA 
TTACCTGTGT 
CAGCG6CA0C 

ct i trfcivrf 

G6GCCACAGA 
AGGTGAGAGG 
CTCTGCCAQC 
CCCACCTGTG 
OCSUSAGAGAA 
CCTGTGGAGT 
GTGTACTACA 
GOCAGGTTGA 
AGGTGCTGTO 
CIGGAGICTT 
TTTCTTACAA 
CCAGACCTAC 
CCAGACA6GS 
AGAACACTGC 
CCATCGTGTQ 
CACCCCTGGC 
GCAGAGGAGC 
TQGATGAAAC 
CCAG6CCATC 



1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820
2880
2940 
3000 



PCT/US02/12476 



51 

I 

AEGKSLGSAZ< 
LGAAKKPPVT 
VLCDSCIGNK 
TMELFCQTDQ 
KWQKEKDRIK 
DERAKVLHED 
NFKDDLUTVC 
YSMYLTPKGG 
FWQGSSSFS 



51 

1 

GGCGCTGGAG 
GAGC06CTCA 
AGGGGAAATC 
GTG06GCT6C 
TCCCA6CT6T 
AOGAGCXrOC 
CCACCCCCAC 
AGCTGGAOGC 
AGAOCXTOGG 
GGCTGCTGGG 
AGGTCXXSM 
CACCACCG6A 
TG6ATGA6TG 
TGGTTGAAT6 
TAGCCACCTC 
GTGTGCTGGT 
CATCTTGGTT 
GTC6TCCTGT 
CTGGGGAGGA 
CACCTCTGCT 
CTOQOCCTCT 
TAAA6AGGTT 
TT0GTGG6CT 
CTGAGTTTT6 
CAGAAGCTGT 
TAGGG06CTG 
AGTTTCTGTG 
A0CACICT6C 
AATAA6TTAC 
TAGCATTTTG 
GOQGQGGCTG 
CTQGATGCAT 
GATTCAATAG 
CATTGTACCT 
TGGTGGATAA 
GGTGCAGGCC 
TGCTCAA08C 



60 
120 
180 
240 
300 
360 
420 
480 
540 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 




TCXGTGGTTT GTCAGMXTG CAAGCAAGOC CCCT GCTGG G GAAGCCTAGG TGTCCTTGA6 2280

CTCSAACOSCA CTGAAGAACT CTTGTCCTCA CT6GCTGATG CAGCAGAACT CTT6GGAAAT 2340 

GTCTTAGTCC TGCAGAATCA GGAGTCACCA GATGATGCAG AGTTGAGATC MXATTGCAA 2400 

AGTTCTCTGT TCCTGAGGAA CTAAATTTAA GGAAAAAATG GGATTTTGTT TTAGAGTTGG 2460 
AAAAAAAGCC TGATTAAAGA GTTTCimXT GTTAAAAAAA AAAAAAAAAA AAAAAA 

Seq ID NO: 97 Protein sequence: 
Protein Accession #: NP_542399.1

1 11 21 

I I I 

NS6RRTRS6G AAQRSGPRAP SPTKPLRRSQ 
KRZVAHAVEV PAVQSPRRSP RZSPFLEKQ? 
EABSSSKBGE LDARDLSMSK KVRRSYSRLE 
SPWCSKLTB VPRVCAKPKA PDMTLPGISP 
FEAAEQFDLL VE 

Seq ID NO: 98 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 58-12444 

1 11 21 31 41 51 

I I I I 1 I 

GGGGCATTTC OGOGTCCGGG CC6AGCGGGC GCACXKXSCX3G GAGCGGGACT CX^GOGGCATO 60 

GOSGGCTCOG GAGCCGGTGT GCGTTGCTCC CTGCTGCGGC TGCAGGAGAC CTTGTCCGCT 120 

GOGGACCGCT GOGGTGCTGC CCTGGCOGGT CATCAACTGA TCCGCGGCCT GG6GCAGGAA 180

TGCX5TCCTGA GCAGCAGCCC CX^CGGTGCTG GCATTACAGA CATCTTTAGT TTTTTCCAGA 240 

GATTTOGGTT TGCTTGTATT TGTCOGGAAG TCACTCAACA GTATTGAATT TOGTGAATGT 300 

AGAGAAGAAA TCCTAAAGTT TTTATGTATT TTCTTAGAAA AAATGGGCCA GAAGATOGCA 360 

iXnTACrCTG TTGAAATTAA GAACACTTGT ACCAGTGTTT ATACAAAAGA TAGAGCTGCT 420 

AAATGTAAAA TTCCAGCCCT GGACCTTCTT ATTAAGTTAC TTCAGACTTT TAGAAGTTCT 480 

AGACTCATGG ATQAATTTAA AATTGGAGAA TTATTTAGTA AATTCTATGG AGAACTTGCA 540 

T70AAAAAAA AAATAOCAGA TACAGTTTTA GAAAAA6TAT ATGAGCTCCT A06ATTATT6 600 

GGIGAAGTTC ATCCTAGTGA GATGATAAAT AATGCAGAAA ACCTGTTCC6 CXSCTTTTCTG 660 

GGTGAACTTA AGACCCAGAT GACATCAGCA GTAAGAGAGC CXIAAACTACC TGTTCTGGCA 720 

GGATGTCTGA AGGGGTTGTC CTCACTTCTG TGCAACTTCA CTAAGTCCAT GGAAGAAGAT 780 

CCCCAGACTT CAAGGGAGAT TTTTAATTTT GTACTAAAGG CAATTOGTCC TCAGATTGAT 840

CT6AAQAGAT ATGCTGTGCC CTCAOCTGGC TTG08CCTAT TTGCCCTGCA T6CATCTCAG 900 

TTTAGCACCT GOCTTCTGGA CAACTAC33TG TCTCTATTTG AAOTCTTGTT AAAGTGGTGT 960 

GOCCACACAA ATGTAGAATT GAAAAAAGCT GCACTTTCAG CCCTGGAATC CTTTCTGAAA 1020 

CAGGTTTCTA ATATGGTGGC GAAAAATGCA GAAATGCATA AAAATAAACT GCAGTACTTT 1080 

ATGGAGCAGT TTTATGGAAT CATCAGAAAT GTGGATTOGA ACAACAAGGA GTTATCTATT 1140 

GCTATCCGTG GATATGQACT TTTT6CAGGA CGGTGCAAGO TTATAAAOGC AAAAflATGTT 1200 

GACTTCATGT AOGTTGAiQCT CATTCAG06C TGCAAGCAGA TGITCCTCAC CCAGACAGAC 1260 

ACTGGTGAOG ACCGTGTTTA TCAGATGCCA AGCTTCCTCC AGTCTGTTGC AAGOGTCTTG 1320 

CTGTACCTTG ACACAGTTCC TGAGGTGTAT ACTCCAGTTC TGGAGCACCT 06TGGTGATG 1380 

CAGATAGACA GTTTC C CACA GTACAGTCCA AAAATCCAGC TOGTOrGTTG CAGAGCCATA 1440 

(3TQAA00TGT TOCTAGCTTT GGCAOCAAAA GOGCCAGTTC TCAlOGA A TTQ CATTAGTACT 1500 

QTGOTGCATC AGGGTTTAAT CA6AATAT6T TCTAAACCAO TOGTCCTTCC AAA666C0CT 1560 

GAGTCTGAAT CTGAAGACCA COGTGCTTCA GGGGAAGTCA GAACT6GCAA ATGGAAGGTG 1620 

CCCACATACA AAGACTACGT GGATCTCTTC AGACATCTCC TGAGCTCTGA CCAGATGATG 1680 

6ATTCTATTT TAGCAGATGA AGCATTTTTC TCTGTGAATT CCTCCAGTGA AAOTCTGAAT 1740 

CAmACTTT AT6ATGAATT TGTAAAA!TCX: GTTTTGAAGA TTGTTGAGAA ATTOGATCTT 1800 

ACACTTGAAA TACAGACTGT TGOGGAACAA GAGAAT66A6 AT6AG6CGCC TG6TGTTTGG 1860 

ATQATCCCAA CTTCAGATCC AGCX3GCTAAC TTGCATCCAG CTAAACCTAA AGATTTTTCG 1920 

GCTTTCATTA ACCTGGTGGA ATTTTGCAGA GAGATTCTC3C CTGAGAAACA AGCA6AATTT 1980 

TTTGAACCAT GGGT6TACTC ATTTTCATAT GAATTAATTT TGCAATCTAC AAGGTTGCCC 2040 

CTCATCAGTG GTTTCTACAA ATTGCTTTCT ATTACAOTAA GAAATQCCAA OAAAATAAAA 2100 

TATTTOQAGG GAGTTAOTCC AAAGAGTCTG AAACACTCTC CTGAAGACCC AGAAAAGTAT 2160 

TCTTGCTTTG CTTTATTTGT GAAATTTG6C AAAGAGGTGG CAGTTAAAAT GAAGCAG771C 2220 

AAAGATGAAC TTTTGGCCTC TTGTTTGACC TTTCTTCTGT CCTTGCCACA CAACATCATT 2280 

GAACTC3GATG TTAGA6CCTA CXSTTCCTGCA CTGCAGAT6G CUTCAAACT GGGCXTTGAGC 2340 

TATACOOCCT TGGCAGAAGT AQQCCT6AAT GCTCTAGAAO AATOGTCAAT TTATATTGAC 2400

A6ACATGTAA TGCAGCXnTA TTACAAAQAC ATTCTCCC C T GCCTGQATG6 ATAOCTGAAO 2460 

ACTTCA6CCT TGTCAGATGA GACCAAGAAT AACTGGGAAG TGTCAGCTCT TTCTCGGGCT 2520 

GCCCAGAAAO GATTTAATAA AGTGGTGTTA AAGCATCTGA AGAAGACAAA GAACCTTTCA 2580 

TCAAA06AAG CAATATGCTT AGAAGAAATA AGAATTAGAG TAGTACAAAT GCTTOGATCT 2640 

CIAGGAG6AC AAATAAACAA AAATCTTCIG ACAGTCAGGT OCTGAOATGA QAIGATGAAO 2700

AGCTAT6TGG CCTGGOACAO ASAGAAG0G6 CTGAGCTTTG CAOTGOCCTT TAOAOAGATO 2760 

AAAOCTGTCA TTTTCCTGGA TGTX3TTCCTG CCTOGAGTCA CAGAATTA6C GCTCACAGCC 2820 

AGTGACAGAC AAACTAAAGT TGCAGCCTGT GAACTTTTAC ATAGCATGGT TATGTTTAT6 2880

TT66GCAAA6 CCAOGCAGAT GOCAGAAGGG G6ACAGGGAG CCCXAOCCAT GTAOCAGCTC 2940 

TATAAGOGGA OLTlTrOCl'OT 6CTGCTTC6A CTT60STGTG ATGTTQATCA GGTGACAAGO 3000 

CAACT6TATG AGCCACTAGT TAT6CA6CT6 ATTCACTGGT TCACTAACAA CAAGAAATTT 3060 

6AAAGTCAGG ATACTGTTGC CTTACTAGAA GCTATATTGG ATGGAATTGT GGACCCTGTT 3120 

GACAGTACTT TAAGAGATTT TTGTOGTOSG TGTATTCGAG AATTCCTTAA ATGGTCCATT 3180 

AAG CAAAT AA C ACCAC AGCA GCSUK3AGAAG AGTCCAGTAA AC ACCAAA TC GCTTTTCAAG 3240 

OGACTTTATA GCCTTGGGCT TCATOCCAAT GCTTTCAAGA GGCTGGGAGC ATCACTT6CC 3300 

TTTAATAATA TCTACAGGGA ATTCAGGGAA GAAGAGTCTC TGGTGGAAGA G T l ' TGlOTfT 3360 

GAAGCCTTGG TGATATACAT GGAGA6TCTG GCCTTAGCAC AT6CA6ATGA GAAGTCCTTA 3420 

GGTACAATTC AACAGTOTTG TGATGCCATT GATCACCTAT GCC6CATCAT TGAAAACaAG 3480

GATGTTTCTT TAAATAAAGC AAAGAAACGA OGTTTGCOGC GAGGATTTC C ACCTTCOGCA 3540 

TCATTGTGTT TATTGGATCT G6TCAAGTG6 CTmAOCTC ATTGTGOGAG GCCCCAGACA 3600 

6AATGTCGAC ACAAATCCAT TGAACTCTTT TATAAATTOQ TTOCTTTATT CCCAOGCAAC 3660 

AGATCCCCTA ATTTGTGGCT GAAAGATGTT CTCAAOGAAO AAGGTtSTCTC TTTTCTCATC 3720 

AACACXrrrrG AGGQGGGTGG CTGTGGCCAG CCCTCGGGCA TCCTGGCCCA GCCCACCCTC 3780 

TTGTAOCTTC 6GGG6CCATT CA6CCTGCAG G0CA06CTAT GCTGGCTGGA CCTGCTOCTG 3840 
GCOGOSTTGG ASTGCTftCAA CAOGTTGATT GGCGAGASAA CTQTAGGAGC GCTCCAGGTC 3900

CTA6QTACT0 AAGCCCAGTC TTCACTTTTQ AAA6CAGT6G CrfTCriVrr A6AAAGCA7T 3960 

GCCATGCATG ACATTATAGC AGCAGAAAAG TGCTTTGGCA CTGGGGCA6C AGGTAACAGA 4020 

ACAAGCCCAC AAGAGGGAGA AAGGTACAAC TACAGCAAAT GCACCGTTGT GGTCOGGATT 4080 

AT6GAGTTTA CCAOGACTCT GCXAAACACC TCCCCGGAA6 GATGGAAGCT CCXGAAGAA6 4140 

GACrTGTGTA ATACACAOCT GATGA6AGTC CTGGTGCAGA OGCTGTGTGA GOOGGCAAGC 4200 

ATAOGTTTCA ACAT0G6AGA 06TCCA0GTT ATGGCTCATC TTCCTGATGT TTGTGTGAAT 4260 

CTGATGAAAG CTCTAAAGAT GTOXCATAC AAAGATATCC TAGAGACCCA TCTGAGAGAG 4320 

AAAATAACAG CACAGAGCAT TGAGGAGCTT TGTGCOSTCA ACTTGTATQG CCCTGAOSCG 4380 

CAAGT06ACA G6AGCAG6CT G8CT6CZOTT GTCTCTGOCT GTAAACA6CT TCACAGAGCT 4440 

GQQCTTCTGC ATAATATATT ACOGTCTCAG TCCACAGATT TGCATCATTC T6TTO6CACA 4500 

GAACrrCTTT CCCTGGTTTA TAAAGGCATT GCCCCTGGAG ATGAGAGACA GTGTCTGCXn? 4560 

TCTCTAGACC TCAGTTGTAA GCAGCTGGCC AGOGGACTTC TGGAGTTAGC CTTTGCTTTT 4620 

GGAOGACTGT GTGAGCX3CCT TGTGAGTCTT CTCCTGAACX: C3V0CGGTGCT GTCCACGGCG 4680 

TCCTTGGGCA GCTCACAGGG CAGCGTCATC CACTTCTCCC ATGGGGAGTA TTTCTATAGC 4740 

TTGTTCTCAG AAACGATCAA CACX3GAATTA TTGAAAAATC TGGATCTTGC TGTATTGGAG 4800 

CTCATGCAGT CTTCAGTGGA TAATACCAAA ATGGTGAGTG C0C3TTTTGAA CGGGATGTTA 4860 

GACCAGAGCT TCAGGGAGCG AGCAAACCAG AA^CACCAAG 6ACTGAAACT TGOQACIACA 4920 

ATTCTGCAAC ACTGGAAGAA GTGTGATTCA TGGTGGGCCA AAGATTCCCC TCTOGAAACT 4980 

AAAATGGCAG TGCTGGCCTT ACTGGCAAAA ATTTTACAGA TTGATTCATC TGTATCTTTT 5040 

AATACAAGTC ATGGTTCATT CCCTGAAGTC TTTACAACAT ATATTAGTCT ACTTGCTGAC 5100 

ACAAABCTGG ATCTACATTT AAAGGGCCAA GCTGTCACTC TTCTTCCATT CTTCACCAGC 5160 

CTCSICTOGAG GCAGTCTGGA GGAACTTAGA CGTGTTCTGG AGCAGCTCAT CGTTGCTCAC 5220 

TTCOOCATGC AGTCCAGGGA ATTTCCTCCA QGAACTCCGC GGTTCAATAA TTATGTGGAC 5280 

TGCATGAAAA AGTTTCTAGA TGCATTGGAA TTATCTCAAA GCCCTATGTT GTTGGAATTG 5340 

ATGACAGAAG TTCTTTGTCG GGAACAGCAG CATGTCATGG AAGAATTATT TCAATCCAGT 5400 

TTCAGGA6GA TTGCCAGAA6 GGGTTCATG7 GTCACACAAG TAGGCCTTCT G6AAAG0GTG 5460 

TATGAAATGT TCAGGAAQGA TGACCCCOGC CXAAGTTTCA CACGCCAGTC CTTTGTGGAC 5520 

OGCTCCCrCC TCACTCT6CT GTG6CACTGT A6CCTGGAT6 CTTTGAGA6A ATTCTTCAGC 5580 

AOUVTTGTGG TGGATGCCAT TGATGTGTTG AAGTCCAGGT TTACAAAGCT AAATGAATCT 5640 

ACCTTTGATA CTCAAATCAC CAAGAAGATG GGCTACTATA AGATTCTAGA CGTGATGTAT 5700 

TCTCGCCTTC CCAAA6ATGA TGTTCATQCT AAGGAATCAA AAATTAATCA AGTTTTCCAT 5760 

06CTCGTGTA TTACAGAAG6 AAATQAACTT ACAAA6ACAT TGATTAAATT GTGCTAOGAT 5820

GCATTTACAG AGAACAI06C AGGAGA6AAT CA6CT6CTG6 AGAGGAGAA6 ACTTTACCAT 5880 

TGTGCAGCAT ACAACTGOSC CATATCTGTC ATCTGCTGTG TCTTOUITGA 6TTAAAATTT 5940 

TACCAAGGTT TTCTGTTTAG TGAAAAACCA 6AAAAQAACT TGCTTATTTT TGAAAATCTG 6000

ATCGACCTGA AGOGCOGCTA TAATTTTCCT GTAGAAiSTTG AGGTTCCTAT GGAAAGAAAS 6060 

AAAAAGTACA TTGAAATTAG 6AAAGAAGGC AGAGAAGGAG CAAAT60GGA TTCAGAltSGT 6120 

CCTTCCTATA TGTCTTCCCT GTCATATTTG 6CAGACAGTA CCCTGAGTGA GGAAATGAGT 6180 

CAATTTGATT TCTCAACOGG AGTTCAGAGC TATTCATACA GCTCCCAAGA CCCTA6ACCT 6240 

GCCACTGGTC GTTTTCGGAG ACGGGAGCAG CGGGACCCCA CGGTGCATQA TGATGTGCTG 6300 

GAGCTGGA6A T6GA0GAGCT CAATCG6CAT 6AGTGCATGG 06CCCCT6AC GGCCCTGGTC 6360 

AAGCACATGC ACAGAAGOCT GGGCCOSCCT CAAGOASAAG AGGATTCAGT 6CCAAQA8AT 6420 

CTTCCTTCTT GGATGAAATT CXTTCCATOGC AAACTGG6AA ATCCAATAGT ACCATTAAAT 6480 

ATCCGTCTCT TCTTAGCCAA GCTTGTTATT AAXACAGAA5 AGGTCTTTCG CCCTTACGCG 6540 

AAGCACTGOC TTA6CCCCTT GCTGC3W5CTG GCTGCTTCTG AAAACAATGG AGGAGAAGGA 6600 

ATTCACXACA TGGTGGTTGA GATA6TGGCC ACTATTCTTT CAT6GACASG CTTGGCCACT 6660 

CCAACA8GGS TCCCTAAA6A TGAAGTGTTA 6CAAAT0GAT TOCTTAATTT OCTAATGAAA 6720 

CATGTCTTTC ATCCAAAAAO AGCTGTGTTT AGACACAACC TTQAAATTAT AAAGACCCTT 6780 

GTCX3A6TGCT 6GAAGGATTG TTTATCCATC CCTTATAGGT TAATATTTGA AAAGTTTTCC 6840 

GGTAAAGATC CTAATTCTAA AGACAACTCA GTAGGGATTC AATrSCTAGG CATCGTGATG 6900 

GCCAATGACX: TGCCTCCCTA TGACCCACA6 TGTGGCATGC AGAGTAGOQA ATACTTCCA6 6960 

GCTTTOGTGA ATAATATGTC CTTTGTAAGA TA7AAAGAA0 TGTATG008C TQCAGCAGAA 7020 

GTTCTAG6AC TTATACTTCG ATATGTTATG GAGA6AAAAA ACATACTGGA GGAGTCTCT6 7080 

TGTQAACTGG TTGOSAAACA ATTGAAOCAA CATCAGAATA CTATGGAG6A CAAGTTTATT 7140 

GTGTGCTTGA ACAAAGTGAC CAAGAGCTTC CCTCCTCTTG CAGACAGGTT CATGAATGCT 7200 

GTGTTCTTTC TGCT6CCAAA ATTTCATGGA GTGTT6AAAA CACTCTGTCT GGAQGTGGTA 7260 

CTTT6TC6TG TGGAG6GAAT GACAGAGCTG TACTTCCAGT TAAAGAGCAA OGACTTOGTT 7320 

CAAGTCATGA GACATAGA6A TGATGAAAGA CAAAAAGTAT GTrTGQACAT AATTTATAA6 7380 

ATGATGCCAA AGTTAAAACC AGTAGAACTC CGAGAACTTC TGAACCCCGT TGTGGAATTC 7440 

GTTTCCCATC CTTCTACAAC ATGTAGGGAA CAAATGTATA ATATTCTCAT GTGGATTCAT 7500 

6ATAATTACA GA6ATCCAC3A AA6TGAGACA GATAATGACT CCCA6GAAAT ArTTAAGTTG 7560 

OCAAAASATG TGCTGATTCA AGGATTGATC GATGaOAACC CTG6ACTTCA ATTAA3TATT 7620 

06AAATTTCT GGA6CCATGA AACTAGGTTA CCTTCAAATA CCTTGGACCG GTT6CTG6CA 7680 

CTAAATTCCT TATATTCTCC TAAGATAGAA GTGCACTTTT TAAGTTTAGC AACAAATTTT 7740 

CTGCTOGAAA TGACCAGCAT GAGCCCAGAT TATOCAAACC OCATGTTaSA GCATCCTCTG 7800 

TCACSAATGOS AATTTCAGGA ATATACXSVTT 6ATTCTGATT GGG6TTTC06 AA6TACTGTT 7860 

CrcACTOOQA TOTTTGTOQA GACCCA8GCC TCOCAGOGCA CTCTCCAGAC CX3G1ACCCA0 7920 

GAA06GTCCC TCTCAGCTOQ CTGGOCAGTG 6CA866CAGA TAA6GGCCAC CCAGCAGC310 7980 

CATGACTTCA CACTGACACA GACTGCAGAT GGAAGAAGCT CATTTGATTG GCTGAC0GG6 8040 

AGCAGCACTG ACCX^CTGGT CGACCACACC AGTCCCTCAT CTGACTCCTT GCTGTTTGCC 8100 

CACAAQAGGA GTGAhAOGTT ACAGA8AGCA CCCTTGAAGT CAGTG6G6CC TGATTTTCGG 8160 

AAAAAAAOSC It aG GCX T fO C AOOGGACGAO GTGGATAACA AAGTGAAAOS TGGGGCCG6C 8220

OGGAOGGACC TACTAOBACT G06CAGACGG TTTAT6AGG0 ACCAQ6A6AA GCTCAGTTT6 8280

AT6TATGCCA GAAAA6G0GT TGCTGAGCAA AAAOGAGAGA A6GAAATCAA GAGTGAGTTA 8340 

AAAATGAAGC AGGATGCXXZA GGTOGTTCTG TACAGAAGCT ACOGGCAOGG AGACCTTCCT 8400 

GACATTCAGA TCAAGCACA8 CAGCCTCATC ACCCOGTTAC AGGCOGTGGC CCAGAGGGAC 8460 

CCSU^AATTG CAAAACAGCT CTTTAGCAGC ' ITUTm ' Ci ' G 6AATTTTGAA AGAGATOGAT 8520 

AAATTTAAGA CACTGTCTSA AAAAAACAAC AIXACTCAAA AGTT6CTTCA ASACTTCAAT 8580 

C3GTTTTCTTA ATACCACXOT CTCTTTCTTT CC3VCCCTTTG TCTCTTGTAT TCAGGACATT 8640 

AGCTGTCAGC AOGCAGOCCT GCTGAGCCTC 6ACCCAG0GG CTGTTAGOGC TGGTTGCCTG 8700 

GCCAGOrrAC AGCAGCCXX5T GGGCATCCGC CTQC1A6AGG AGGCTCT6CT CCGCCTGCTG 8760 

CCTGCTGAGC TGCCTGCCAA GCGAGTCCGT GGGAAGGCCC GCCTCCCTCC TGATGTCXrrC 8820 

AGAIGGGTGG AGCTTOCTAA QCTGTATAGA TCAATTGGAG AATACGACX5T CXTCXX3TGGG 8880 

ATTTTTACCA GTGAGATAGG AACAAAGCAA ATCACTCAGA GTGCATTATT AGCAGAAGCC 8940 

AGAAGTOATT ATTCTGAA8C TGCTAAGCAG TATGAT6AGG CTCTCAATAA ACAAGACTGG 9000 

GTA6ATGGTG A6CCCACA6A AGCGQAGAAG GArrrTTGGG AACTTGCAIG CCTTGACTGT 9060 
TAOUVCCACC
6AGAACCCCC 
CCTTACATGA 
CTGACATTTA 
TACAGTCAAG 
TAlOVTTCAAA 
CACCAAAGTA 
ATCAGCTTTA 
AACACCTGGA 
ATCATCACAA 
GAAOATAATA 
6A6CAGGAAG 
ATGATAGACA 
CTGCATAAAG 
GGGCTGAGCC 
AAAACAGTCT 
GCTTTCOGTG 
AGCA6TGAGC 
CTTTCTGGAT 
TTCCAGC3UX: 
AGCTGTGGGC 
CAACAGCTGC 
TATGCAGCAC 
AGATTGAAGT 
CTCATGACAA 
ATGGTGGCCT 
ACTGATAACT 
TTCAAG6ATA 
TTC3GA7CAA6 
GAACTGCTCT 
AATAAAAAAA 
GCTCCA6GCC 
AAACATTTTS 
ATTACCAACA 
QAATGTTCAC 
COOGSTCAGT 
TTTGATGAGC 
G6CCATGAGS 
CAGOGCGTGO
AGCCAGAGGG 
TTAATTGAGT 
CAAGAGGAGA 
TGGCTGACAA 
GCTAATCGTA 
CTCTTAAAGC 
TCCCACTTOG 
GACAGACATC 
TTTG6GCAIG 
OGOCTAACTC 
AGCATCAT6G 
AT6GATGTX3T 
AAAAAAG6AG 
CASAAAATAT 
GATGAGCTAC 
C6A6GAAGCA 
ACPCAAGrrGA 
GAAGGATGGG 
6TTTAAAGAA 
CTAAAGAGAA 
TGAGTAAATG 
AGGTTTATAQ 
TGATCA6CTT 
TGGA6GAAAT 
CTCAGAAGGC 
TTGTAGAAOC 
TTAGAAATGA 
TTCTCTTCTA 
TAOGAGGGCA 
GATACATAAA 
TAGGGATA6T 
ATTCCTCATT 
ACAAAAGTGG 
AACTGTTTCT 
TTTG6AGGCT 
CCAAAAGTA 



TTGCTGAGTG 
CAGACCTAAA 
TCC3GC3VGCAA 
TTGACAAAGC 
AGCTGAGTCT 
ATGGCATTCA 
GACTCACCAA 
TAAGCAAACA 
CAAACAGATA 
AT0QATX3TTT 
GTATGAAT6T 
AAGATATCAG 
GTGCCOGGAA 
AGTCAfiAAAC 
ACTGCOGGAG 
CTTTGTTGGA 
ACCAGAACAT 
CAGCCTGCCT 
CCAGTTCAGA 
TCTCTGAGGC 
CTGCAGCTGG 
GCAAGGAGGA 
TTGTGGTGGA 
TTCCTAGATT 
AAGAGATCTC 
TACTGGACAA 
ACCCGCAGGC 
CTTCTACTGG 
GAGGAGT6AT 
TTAAGGATTG 
ACATTGAAAA 
TGGGGGCCTT 
GQAAAOGAGG 
TOCTACTTTT 
CCTGGAT6AG 
ATGACGGTAG 
GG6TQACAGT 
AGAGGGAACA 
A6CA6CTCTT 
CCCTGCAGCT 
GGCTTGAAAA 
A66CG6CTTA 
AAATGTCAGG 
CTGAAACAGT 
GGGCCTTCGT 
CCAGCTCTCA 
TGAACAACTT 
OQTTTGGATC 
GCCAGTTTAT 
TACACGCACT 
TTGTCAA6GA 
GGTCATGGAT 
GTTA06CTAA 
TCCTG6GTCA 
AA6ATCACAA 
AGTGCCTGAT 
AGCCCTGGAT 
TCTACTATAC 
ATGTCTTTTG 
TGTATQQGTT 
AAAGATAGAT 
TCAAAGCATT 
GT666GAAGC 
TTCATCACCA 
A0CATAG6AA 
CTGCATTTQA 
GTTTTGACAT 
AAAATTTTGG 
AGTGCTTT6C 
ACTAAGCATT 
TGGAGGAAAA 
CTCCTTCCCA 
GATTOGCTTT 
CTTCTGTGAT 



GAAATCACTT 
TAAAATCTG6 

6CTGAAGCTG 
TATGCACGGG 
GCTTTACCTC 
GAGTTTTATG 
ATTGCAGTCT 
AGGCAATTTA 
TCCAGATGCT 
CTTTCTCAGC 
QGATCAA6AT 
CTCCX2TGATC 
GCAGAACAAT 
CAGAGACGAT 
CXX3GTCXXAG 
TGAGAACAAC 
TCTCTTGGGT 
TGCTGAAATC 
GGATTCAGAG 
TGTGCAGGOG 
GGTGATTGAT 
AGAGAAT6CA 
GAAAATGTTG 
ACTTCAGATT 
TTCCGTTCCC 
A6ACCAAGCC 
TATT6TTTAT 
TCAIAAGAAT 
TCAAGATTTT 
GAGCAATGAT 
AATGTATGAA 
TAGAAGGAAG 
7TCTAAACTA 
AAAAAT6AAC 
CGACTTCAAA 
GGGAAAGCCA 
CAT6GGGTCT 

cccrrnxTQ 

CCA86TCATG 
GAGGACCTAT 
TACTGTTACC 
CCTGAGTGAT 
AAAACATGAT 
C3V0GTCTTTT 
GAGGATGAGT 
CGCTCTGATA 
TATGGTGGCC 
GGCTACACAG 
CAATCTQAT6 
CCGGGCCTTC 
GCCCTCCTTT 
TCAAGAAATA 
6AGAAAGTTA 
TGAGAAGGCC 
CATTOGTGCC 
GGACX^kGGCA 
GTGAG6TCTG 
TTT6CTTG6C 
T6CTACAGTT 
AAATCAAAGA 
ATCCAGGCTT 
TACAAGTGCT 
CTT6GAATGC 
AGATTTTGGG 
CAATAAGAAC 
TATTTTAGGA 
TTTATGATAG 
TCATAGCATT 
ATTGAATTTG 
TCAGTTCCAG 
AAAGCATGCA 
TGTGCAGTCC 
TAiSCTTTTTG 
TTT6AGAAGT 



GAATACTGTT 
AGTGAACCAT 
CTGCTCCAGG 
<SU3CTCCAGA 
CIGCAAGATG 
CAGAATTATT 
GTACAGGCTT 
TCATCTCAAG 
AAAATGGACC 
AAAAt AGAGG 
OSAGACCCCA 
AGGAGTT6CA 
TTCTCACTTG 
TGGCTGGTGA 
GGCTGCTCTG 
GTGTCAAGCT 
ACAACTTACA 
GAGGAGGACA 
AAGGTGATCS 
GCTGAGGAGG 
GCTTACATGA 
tcagttattg 
AAAGCTTTAA 
ATAGAAC3GGT 
T6CTGGCAGT 
GTTGCT6TTC 
CCCTTC31TCA 
AAGGAGTTTG 
ATTAAT6CCT 
GTAAGAGCTG 
AGAATGTATG 
TTTATTCAGA 
CT6A6AA1CA 
AAAGACTCAA 
GTGGAGTTOC 
TTGCCAGAGT 
CTGOQAAGGC 
6TGAA6GGTG 
AATGGGATOC 
AGOGTTGTGC 
TTGAAQGACC 
CCCAGGGCAC 
GTTGGAGCTT 
AGAAAA06AG 
ACAAGCCCTG 
TGCATCAGCX 
ATGGAGACTG 
TTTCT6CCAG 
TTACCAATGA 
CGCTCAGACC 
GA7TGGAAAA 
AA1X3TTQCTQ 
QCAGGTGCCA 
CCTGCCTTCA 
CAAGAACCA6 
ACAGACCCCA 
TGGGAGTCTG 
AGCATTCCAT 
TCGTAGCATG 
TAAGGTTATA 
ACCAAAGTAT 
GCAA6TTAGT 
OCTTCTGOTT 
AGA6TAAA0C 
AATAGGTAAA 
TATTTTTCTA 
ATTTGCrCTC 
CACTTTTGCT 
GGATAACTTC 
GAGAATAAAA 
TTCTAGCACA 
CXGTCGOCCC 

TTGTTrmr 

AXACTCTTQA 



CTACAGOCAO 
TTTATCAGGA 
GAGAGGCTGA 
AGGOGATTCT 
ATGTTGACAG 
CTAGTATTGA 
TAACAGAAAT 
TTCCOCTTAA 
CAATGAACAT 
A6AAGCTTAC 
6T0ACAGGAT 
AGTTTTCCAT 
CTATGAAACT 
GCTGGGTGCA 
AGCAGGTGCT 
ACTTAAGCAA 
GGATCATAGC 
AGGCTAGAAG 
CGGGTCTGTA 
AGGCCCAGGC 
06CTGGCAGA 
ATTCTGCAGA 
AATTAAATTC 
ATCCAGAGGA 
TCATCAGCTG 
AGCACTCTGT 
TAA6CAGCGA 
TGGCAAGGAT 
TAGATCAGCT 
AACTAGCAAA 
CA6CCTTG66 
CTTTTGGAAA 
A6CTCA6T6A 
AG0CCCCTG6 
TGAGAAATGA 
ACCACX3T6CO 
CCAAGOGCAT 
60GAGGACCT 
T6GCCCAAGA 
CCATGACCTC 
TTCTTTTGAA 
OGCOGTGTGA 
ACATQCTAAT 
AAAGTAAA6T 
AGGCTTTCCT 
ACTGGATCCT 
GCGGCXSTGAT 
TCCCTGASTT 
AA6AAACGG6 
CTGGCCTGCT 
ATTTTGAACA 
AAAAAAATTG 
ATCCAGCA8T 
GAGACTATGT 
AGAGT666CT 
ACATCCTTGG 
CAGATAQAAA 
QAOCTGATTT 
AGTTTAAATC 
GTAACATCAA 
TAAGTCAAGA 
GAAACAGCTG 
CTGOCACATT 
TAAGTATAGT 
GCTATAATTA 
GGTTTTTTCC 
TAGAA6GAAA 
ATTCCAATCT 
AAAAATCCCA 
GAAATTCCTA 
ACAAGATGAA 
COGCCAGTCC 
TTTTOTTCT 
GTGTTTAATA 



TATAGACAGT 
AACATATCTA 
CCAGTCCCTG 
A6AGCTTCAT 
AGCCAAATAT 
TGTCCXCTTA 
TCA66AGTTC 
QAGACTTCTG 
CTGGGATGAC 
CCCTCTTOCA 
GGAAGTGCAA 
GAAAAT6AAG 
ACTGAAGGAG 
GAGCTACTGC 
CACTGTGCTG 
AAATATTCTG 
GAATGCTCTC 
AATCTTAGAG 
CCAGAGAGCA 
TCCCTCCTGG 
TTTCTGTGAC 
ACTGCAGGCG 
CAATGAAGCC 
GACTTT6AGC 
GATCAGCCAC 
GGAAGAAATC 
AAOCTATTCC 
TAAAAOTAAO 
CTCTAATCCT 
AACCCCTGTA 
TGACCCAAAG 
AGAATTTGAT 
CTTCAACGAC 
GAATCTGAAA 
GCTG6AGATT 
AATCGCOGGG 
CATCATCCGT 
60GGCAGGAC 



CAGGTTAGGA 
CACCATGTCC 
ATATAAAGAT 
GTATAAGGGC 
G0CT6CTGAT 



OGGGATTGGA 
OGGGATCXSAC 
GATG CCTTTT 
CCTTATGTAC 
CACCAACACC 
GAAAATGCTG 
GTACOXXXSA 
CATTACTTGT 
GGCT6TGGCA 
TTCAGAAGAG 
CAGAACCTGG 
GCATTACATT 
TGCT6AAACA 
AAGATTATGA 
AGATTAGGTG 
ATATAATATO 
TCTCOGTAAA 
QGAAA6CACA 
TGATGTAACA 
TGGCTTATAT 
TTTCATTTTA 
OGTCTTTATT 
ACAACT6GAA 
TG6TT6TTGT 
TTTGAAATGA 
ATTATGGAAT 
TCCACACCCA 
AACACTTGTA 
AAGTTTTTTT 
9120
9240 
9300 
9360 
10800
11040
11100 
11160 
11220 
11280 
11340 
11400 
11460 
11520 
11580 
11640 
11700 
11760 
11820 
11880 
11940 
12000 
12060 
12120 
12180 
12240 
12300 
12360
12420 
12480 
12540
12600 
12660 
12720 
12780 
12840 
12900 
12960 
13020 
13080 
13140 
13200 
13260 
13320 
13380 
13440 
13500 



Seq ID NO: 99 Protein sequence: 
Protein Accession #: NP_008835.5

1 11 21 31 41 51 

1 I I I i I 

MAQSGAGVRC SLLRLQBTLS AADROGAALA GHQLIRGICQ ECVLSSSPAV LALQTSItVPS 60 

RDF6LLVFVR RSLNSZBFRE CRSBZXiKFXiC ZFLEKMGQKI APYSVBIXHT CTSVyTKDRA 120 

AKCKZPAIiDL LZKLLQTFRS SRIMDEPKZG ELFSKFYGEL ALKKRZPDTV I^KVYELLGL 180 

LGEVHPSS4I NNAENLFRAP IX3ELKTQMTS AVREFKliFVL AGCLKGLSSL IiQTFTICSMEB 240 

DPQTSREIFH FVLKAIRPQI DLKRYAVPSA GZALFALBAS QFSTCLLOHY VSLFEVLLKH 300 

CABTNVELKR AALSALBSFL KQVSNMVAKN AQffiXNKLQY FMBQFYGIIR HVDSSQIKELS 360 
ZAXSGYGLFA QPCKVINAKD VOFMYVBLZQ RGXQNPLTQT OTGDDRVyQH PSFLQSVAS7 420

ZiLYLDTVPEV YTPVLEBLW MOXDSFPQYS PKMQLVGCRA XVKVFLALAA RGFVLHMCIS 480 

TWHQGIjIRI CSKFWLPKG PESESEDHRA SGHVRTGKWK VFTYKZ7YVDL FRBLLSSDQM 540 

KDSILADEAF PSVNSSSBSL HRLLYDEFVR SVLKIVEKU) LTLBIQTVGB QENGDEAPGV 600 

HMIPTSDPAA NLHPAKPKDF SAFZNLVBFC HEXLPEKQAB FFEPWVYSFS YELILQSTRIi 660 

PLZSGFYKIiL SZTVRKAKKI KYFEGVSPKS LKH5FEDPEK YSCFALFVKF GRBVAVKHKQ 720 

YKDELLASOi TFLLSLPBB7I lELDVRAYVP ALQKAFKLGL SYTPLAEVGL NALSEWSZYI 780

DHHVMQPYYR DILPCUXTYL BTSALSDETK NNHSVSALSR AAQKGFNKW LRHLKKTXNL 840 

SSNEAISLEB IRIRWQMLG 8Ii6GQINEI7Xj LTVTSSDEKM KSYVAKDREK RLSFAVPFRB 900 

MKPVIFtiDVF LPRVTELALT ASDRQTKVAA CEIiXaSMVKF MLGKATQMPE G6QGAPPMYQ 960 

LYXRTFPVLZ* RLACDVDQVT RQLYEPLVMQ LIHMFTNNKR FBSQinVAU. SUUXSZVDP 1020 

VDSTLRDFOG RCZREFLKNS IKQITFQQQB KSFVMTKSLP KRLYSLALHP 2ZAFKRLGASL 1080

APHNIYRKFR EEBSLVBQFV PEALVIYNES LALAHADEKS LGTIQQCCDA IDHLCRIIEK 1140 

KHVSLNKAKK RRLFRGFPPS ASLCLLDLVK HLLAH06RPQ TBCREKSIEX> FYKFVPLLPG 1200 

HRSPNLWLKD VLKEEGVSFL tNTFBGGGCG QPSGILAQPT LLYLRGPPSL QATLCHLDLL 1260

LAALECYNTP IGERTVGALQ VLGTEAQSSL LKAVAPFLES lAMHDIIAAE KCFGTGAAGS 1320 

RTSPQEGEHY NYSKCTWVR IMEFTTTLUI TSPBGHKUiK KDW3JTHU4R VLVQTLCEPA 1380 

SIGFNIGDVQ VMAHLPDVCV MLMKALKMSP YKDILETHLR EKITAQSIEE LCAVNLYGPD 1440 

AQVDRSRLAA WSACKQLHR AGLLBHILPS QSTOLHHSVG TBLLSLVYKG lAPGDERQCL 1500 

P5LDL5CKQL ASGLLELAFA PGGLCERLVS IiLLHPAVLST ASLGSSQG?V IHFSHGEYFY 1560 

SLFSETINTB LLKNLDLAVL HIJ4QSSVnt7T XMVSAVU)Q4 U3QSFRERAK QKHCX3LKLAT 1620 

TILQHWKKCD SWWAKDSPLE TKMAVLALLA KILQIDSSVS FHTSHGSFPB VFTTYISLLA 1680 

DTKLDLBUCG QAVTLLPFFT 8LTG6SLEEL RRVLBQLIVA EFPHQSREFP PGTPRFMNYV 1740 

Da4KKPliZ>AL BLSQSPMLLE UfTEVLCREQ QHVMEELFQS SFRRIARRGS CVTQVGLLES 1800 

VYEKFRKDDP RLSFTRQSFV DRSLLTLLHH CSLDALREFF STIWDAIDV LKSRFTKLNE 1860 

STFDTQITKK MGYYKILDVM YSRLPKDDVH AKESKINQVP HGSCITEGNB LTKTLIKLCY 1920 

DAFTENKA6E NQIiLERRRLY BCAAYNCAIS VICCVFNSUC FYQGFLFSER PEKKLLIFEH 1980

LZOIiKRHW PVEVEVPMER KKnriBZRKB AREAAUODSD GPSYKSSLSY LADSTLSSEM 2040 

8QFDFST6VQ SYSYS8QDPR PAT6RFRRRB QRDPTVHDDV LBLEMDEUNR HBCMAFLTAL 2100 

VKHMHRSLGP PQGEEDSVPR DLPSWMKFLH GKLGtiPIVPl, HIRLFIAKLV INTEEVFRPY 2160 

AKHWLSPLLQ LAASENNGGE GIHYMWBIV ATILSWTGLA TPTGVPKDEV LANRLLKFLM 2220 

KHVFHPKRAV FRHNLBII2CT LVECKKDCLS IPYRliIFEKF SGKDPNSKDK SVGIQLLGIV 2280 

MANDLPPYDP QGGIQSSEYF QALVNKMSFV RYKEVYAAAA BVIiSLIIiRYV NBRIQIILBES 2340 

LCELVAKQUC QHQMTMEDKP ZVCLHKVTXS FPPLADRFMN AVFFLLPKFH GVLKTLCLEV 2400 

VLCRVB6MTB LYFQLKSKDP VQVMRHRODB RQKVCLDIIY KMMFKLKPVE LRELLHPWB 2460 

FVSEPSTTCR BQMYNZLMHI HDNYRDPBSE TDHDSQBIPK LAXDVLIQGL IDQ7P6LQLZ 2520 

IWFKSStBTR LPSHTLORXiZi ALtXSLYSPKI EVRFLSLATV FLLSCrSMSP OYPNPMFEHP 2580 

LSECEFQEYT IDSDWRFRST VLTPKFVBTQ ASQGTLQTRT QEQSLSARHP VAOQZRATQQ 2640 

QBDFTLTQTA DGRSSPDHLT GSSTDPLVDH TSPSSDSIiliF AHKRSERLQR APLKSV6PDF 2700 

6KKRLGLPGD EVDNKVKGAA GRTDLLSIiRR RFMRDQEKLS LMYARKGVAB QKREKEIKSE 2760 

LKKKQDAQW LYRSYRHGDL FDIQIXESSL ITPLQAVAQR DPIIAKQLFS SLFSGILKEN 2820 

DKFKTLSEKS KXTQKLLQDF KRFLNTTPSF FPPFVSCIQD ISOQHAALLS LDFAAVSAGC 2880

LASLQQPVGI RUiESALLRL LPAELPAKRV RGKARLPPDV UWVSLAKLY RSZGEYDVLR 2940 

6ZFTSEX6TK QZTQSALLAE ARSDY8EAAK QYDBALNKQD WVDGEPTEAE EOSFWELASLD 3000 

CYNHIiAEWKS LEYCSTASID EEtTPPDUTKI HSEPFYQETY LPYMZRSKZ<K LLIiQGEADQS 3060 

LLTFIDKAMH GELQKAILEIj HYSQELSLLY LLQSDVDRAK YYIQNGIQSP MQNYSSIDVL 3120 

LHQSHLTKLQ SVQALTBIQB FZSFZSKQOf LSSQVPLKRL UmmiRYPD AKMDPMNXWD 3180 

DZZTNRCPFL SKZESKLTPL PEDUSMHVDQ DGDPSDRNBV QEQSEDZ8SL ZRSCKFSMKM 3240 

KMZDSARRQV NF8LAKKLLK ELHKBSRTRD DHLV8WVQ8Y CRLSKCRSRS Q6CSBQVL7V 3300 

LKTVSLLDEK NVSSYl>SKKI LAFRDQNILL GTTYRIIANA LSSEPACLAE lEEDKARRlL 3360 

ELSGSSSEDS EKVXAGLYQR AFQHLSEAVQ AAEEEAQPPS NSCGPAAGVI DAYMTLADFC 3420 

DQQLRKEEEN ASVZDSAELQ AYPALWEKM LKALKLNSNE ARLKFPRLLQ XZERYPBETli 3480

SIMTKBZSSV PCHQFZStfZS HMVALLDXDQ AVAVQBSVBB ZTDmrPQAZV YPF1Z8SE5Y 3540 

SFKDTST6BK NKSFVARZKS KLDQG6VZQD PZHAIiDQLSK PELLFKDWSir DVRAEIiAKXP 3600 

VNKKNIEKKY ERMYAAIiOJP KAPGL6AFRR KFIQTFGKEP DXHFGKGGSR LLRMKLSDFN 3660 

DITHMLLLKM NKDSKPPOIL KECSPWMSDF KVEFIiRNELB IPGQYDGRGK PLPEYHVRIA 3720 

6FDERVTVMA SLRRPKRIZZ RGBDBREBPF LVK66EDLRQ DQRVEQLFQV MNGXLAQDSA 3780

CSQRALQIAT YSWPNTSRL GLZEWLEHTV TLKDUiUITM 8QEEKAAYLS DPRAPPCSyX 3840 

DWLTKM6GKH DVGAYMUmC GANRTSTVTS FRXRESKS^A DLLKRAFVRM STSPEAFLAL 3900 

RSHFASSBAL ICISHWILGI GDRBX^INFMV AMETGGVIGI DFGHAFGSAT QFLFVPELMP 3960 

FRLTROFIUL MLPMKBTQLM YSIWVHALRA PRSDPGLLTO TODVFVKEPS FDWKNPEQKM 4020 

LKKG6SWIQE ZNVABKMWYP RQKJCYAKRK LAGANPAVZT CDBLLLGHER APAF8DYVAV 4080 
AROSKDBNZR AQEPSSGLSB ETQVKCLNDQ ATDFNZL6RT NBGHEPWM 

Seq ID NO: 100 DNA sequence
Nucleic Acid Accession #: NM_000673
Coding sequence: 101-1225

1 11 21 31 41 51

I i I I I i 

ATGTGAAGGC ACAAGCTGCT GTTATATACA ACaiGAGTOAA CTGAQCATCA GTCAGAAAAA 60 

GTCTATGTTT GCAGAAATAC AGATCCAAGA CAAAGACAGG ATGQGCACTQ CTGGAAAAGT 120 

TATTAAATGC AAAGCAGCTG TGCTTTGGGA GCAGAAGCAA CCCTTCTCCA TTGAGGAAAT 180 

AGAA6TTG0C CCACCAAAGA CTAAAGAAGT TCGCATTAAG ATTTTGGCCA CAGGAATCTG 240 

T06C31CAGAT 6ACCATGTCA TAAAAGQU^ AATG6T6T0C AAGTTTCCA6 TGATTGTGGG 300 

ACATGAGGCA ACTGGGATTG TAGAGAGCAT T66AGAAGGA GTGACTACA6 TGAAACCAGG 360 

TGACAAAGTC ATCCCTCTCT TTCTGCCACA ATGTAGAQAA TGCAAT6CTT GTCGCAACCC 420 

AGATQGCAM: CTTTGCATTA GGAGOSATAT TACTGGTCGT GGAGTACTGG CTGATGGCAC 480 

CACCAGATTT ACATGCAAGG GCAAACCAGT ACACCACTTC ATGAACACCA GTACATTTAC 540 

OSAGTACACA GTQGT GG ATG AATCTTCTGT TGCTAAGATT GATGATGCAG CTCCTCCTGA 600 

GAAAGTCTGT TTAATTG6CT G'A' UUtf r m 'C CACTG GATAT G606CTGCTG TTAAAACTGG 660 

CAAGGTCAAA CCTGGTTCCA CTTGOaTOGT CTTT6GCCT0 GGAGGAGTTO GCCTGTCAGT 720 

CATCATGGGC TGTAAGTCAG CTGGTGCATC TAGOVTCATT GGGATTGACC TCAACAAAGA 780 

CAAATTTGAG AAGGCCATGG CTGTA6GT6C CACTGA6TGT ATCAGTCCCA AGGACTCTAC 840

GAAAOOCATC AGTGAOOTOC TGTCAGAAAT GACAGGCAAC AAQSTGCSGAT ACACCTTT6A 900 

AGTTATT6Q0 CATCTTGAAA CCAT6ATTGA TGC0CTG6CA TCCTGCCACA TGAACXATGS 960 

GACCA&OGTG GTTG7AGGAG TT O CTO CA TC AG0CAAGA1Q CTCACCTAIQ AOCOaAtGTT 1020 
GCTCTTCACT GGACGCACAT GGAAGGGATG
TGTOCCAAAA CTA6TGACTG AGTTCCTGGC 

TCATGrrrrA ccatttaaaa aaatcagtga 

CATTOGAACG GTOCTGAOGT TTTGAGATCC 
GAACTGGAGT TTCTCTTGTG A6AGTTCCCT 
ACAAGCA.TAA GTAGAAGAIT TGTTGMGhC 
TATAAACATT TAAAGTCTTG TGAGCACCTG 
TTGATTTACA TTTTGTAAGG CTATAA1TGT 
TATGTTGAAA TCGAGATTTT TAAGAGTTTT 
CAGATATA6C GTATAAAGAT ATA£3TAAATG 
TTGAAACTAT TATTTTTXAG ATTTGAATAT 
TAACTTGGAT TACATTTTGA AATCAGTTCA 
AGAAAOACAG AAAAGATTAA GGGACG6GCA 
ATAACTTGGT GAAACTGAAA AAG TATATCA 
TATTAATATT TTAGAAAATA TTCCTTTTOT 
ATATTATCAT ACTTATCaTA ATGTTCAATT 
CCTATTCACT GTGCTTA6TA GTGACTOCAT 
CTAAAC06 



TGTCTTTGGA GGTTTGAAAA 6CAGAGAT6A 1080

AAAGAAATTT GACXTTGCACC AGTTGATAAC 1140 

AGGATTTGAO CTGCTCAArr CAI3GACAAA6 1200 

AAAGTGGCAG GAGGT C TGTG TTGTCATGGT 1260 

CATCTGAAAT CATGTATCTG TCTCACAAAT 1320 

ATA6AACCCT TATAAAGAAT TATTAACCTT 1380 

GGAATTAGTA TAATAACAAT GTTAATATTT 1440 

ATCTTTTAAG AAAACATACA CTTGGATTTC 1500 

AACCAGCT6C TGCAGATATA TAACTCAAAA 1560 

CATCTCCCAG AGTAATATTC ACTTAACACA 1620 

AAATGTATTT ITTAAACACT TQTIATGA6T 1680 

TTCCATGATG CATATTACIG GATTAjC»rrA 1740 

CATTTTTCAA CGATTAAGRA TCATCATTAC 1800 

TATGGGTACA CAAGGCTATT TGCCAGCATA 1860 

AATACTGAAT ATAAACATAG AGCTAGA67C 1920 

TGATACAGTA GAATTGCAA6 TCCXnTAAGTC 1980 

TTAATAAAAA GT6TTTTTAG TTTTTAACAA 2040 



Seq ID NO: 101 Protein sequence:
Protein Accession #: NP_000664



1 11 21 31 41 51 

I I I I I I

MGTAGKVIKC KAAVLWEQKQ PFSIEEIBVA PPKTKBVRIK ILATGICRTD DHVIKGTMVS 60 

KFPVIVGHBA TGIVESI6EG VTTVKPGDKV IPLFLPQCRE C31ACRNPDGH LCIKSDITGR 120 

GVLADGTTRP TCKGKPVHHP KNTSTFTBYT WDBSSVAKI DDAAPPEKVC LIGOOPSTGY 180 

GAAVKTGKSrK PQSTCWF6L OSVGIiSVIMS CKSAGASRXX GZDXiNKDKFB KAMAVGATBC 240 

ZSPKDSTKPX SSVLSENTGN NVGYTFEVIG HLEIWDALA SCHMNYGT8V WGVPPSAKM 300 

LTYDPHLLPT GRTWKGCVFG GLKSRDDVPK LVTSPIAKKF DLOQLZTHVL PFKKISBGFB 360 
LLNSGQSXRT VLTF 

Seq ID NO: 102 DNA sequence

Nucleic Acid Accession #: NM_006783.1

Coding sequence: 1..786 



1 11 21 31 41 51 

I I I I I I

ATGGATTGGO 66AGQCTGCA CACTTTCATC OGGGGTGTCA ACAAACACTC CACCA8CATC 60 

GGGAAGGTGT GGATCACAGT CATCTTTATT TTCOGAGTCA TGATCCTAOT GGTGGCTGCC 120 

CAGGAA6TGT GGGGTGACGA GCAAGAGGAC TTC3GTCTGCA ACACACTGCA ACOGGGATGC 180 

AAAAATGTGT GCTATGACCA CTTTTTCCOG GTGTCCCACA TCOGGCTGTG GGCCCTCCAG 240 

CTQATCTTOG TCTCCACCCC AOaQCTGCZG OTGGCCATGC ATGTGGCCTA CTACAGGCAC 300 

GAAAOCACTC QCAAGTTCAG GGGAGGAGAO AAOAGGAATG ATTTCAAACA CATAGAGGAC 360 

ATTAAAAAOC ACAAGGTTC6 GATAGAGGGG T0GCTGT6GT GGACGTACAC CA6CAGCATC 420 

TTTTTCOGAA TCATCTTTGA AGCAGCCTTT ATGTATGTGT TTTACTTCCT TTACAATGGG 480 

TACCACCTGC CCTGGGTGTT GAAATGTGGG ATTGACCCCT GCCXXAACCT TGTTGACTGC 540 

TTTATTTCTA GGCCAACAGA GAAGACOSTO TTTACCATTT TTATGATTTC TGOGTCTG TG 600 

ATTTGCAT6C TGCTTAA08T GGCAGASTTG T6CTACCTGC TGCTGAAAGT GTGTTTTAGO 660 

A6ATCAAA6A GAGCACAGAC QCAAAAAAAT CACCCCAATC AIGCOCTAAA GGAOAgXAAg 720 

CAGAATGAAA TGAATGAGCT GATTTCAGAT AfiTGGTCAAA ATGCAATCAC AOOTTTCCCA 780 
AGCTAA 



Seq ID NO: 103 Protein sequence:
Protein Accession #: NP_006774.1 



1 11 21 31 41 51 

I I I I I I

MDWGTIiHTPl 6GVHKHSTSI GKVWITVIPI PRVMILWAA QBVWGDBQBD PVCMTLQPGC 60 

XNVCniHFFP VSHIRLWALQ LXFVSTPALIi VAKHVAVYRE ETTRXFRRGB KRNDFKDIED 120 

IKKHKVRIEG SLKWTYTSSI PPRIIPEAAP MYVFYFLYNG YHLPWVLKOQ IDPCPNLVDC 180 

FXSRPTEKTV FTIFMZSASV ICMLLNVAEL CYLLLKVCFR RSKSAQTQKK HFNBALKBSK 240 
QNENNELISD SGQNAITGPP 8 



Seq ID NO: 104 DNA sequence
Nucleic Acid Accession #: NM_020411
Coding sequence: 66-526



1 11 21 

I I I

GGACCTG6GA AGGAGCATAO GACAGGGCAA 
AAGGCAOOAO GGAACCTCAC TGCGCAT6CT 
ACTGGQOGTC TTCCCATOGG CCCCTTOGCC 
GGCGACTCGG GTCCCTGAGG TCTGGATTCT 
ACAAACACAG AACCACACAG CCAGTCCCA6 
GAACCAGCAG CTGAAAGTCO 6GATCCTACA 
ACAGCT6AGA TCCCAGTQCQ CGACATGGAA 
AC05GGGATA AATCTGGATT TGGGTTCCGG 
ACACTGTAAA ATGCCAGAAO CAGGTGAAGA 
AAACAA06CA A6CIG6TTTT ATATTAGATA 
CA6CTTTCAC CAAAAAAAAA AAAAAA 



31 41 51 

I I I

GGCGGGATAA GGAGGGGCAC CACAGCCCTT 60 

CCTTTGGTGC CCACCTCAOT GCGCATGTTC 120 

AGTGTGGOQA A0GCG3CGGA GCTGTGAGCC 180 

TTCTCCGCTA CTGAGACAOS G06GACACAC 240 

GAGCCCAGTA ATGGAGAGCC CCAAAAAGAA 300 

CCTGGGCAGC AGACAGAAGA AGATCAGGAT 360 

6GT6ATCTGC AAGAGCTGCA TCA6TCAAAC 420 

CGTCAAGGTG AAGAXAATAC CTAAAGAG6A 480 

GCAACCACAA GTTTAAA3GA AGA CAAGCTG 540

TTTGACTTAA ACTATCTGAA TAAAGTTTTG 600 



Seq ID NO: 105 Protein sequence: 
Protein Accession #: NP_065144.1

1 11 21 31 41 51
I I I I I I 

NIiLNCPFQa CSIiGVFPSAP SPVWatliRSC BPA3KVP8VN ILSFbUIBOO BTQIQHBTAS 
KtSFVMESPK KKHQQUCVOI ISLGSROXKI RIQLRSQCIVT t«mCKSCI8 QltGWUSUB 
SOVXmiPK EESCKNPBKS BSQPQV 

Seq ID NO: 106 DNA sequence

Nucleic Acid Accession #: J04129

Coding sequence: 99-567






60 
120 



1 
I 

CATCCCTCTG 
TCACCCTX3G6 
AGGftCCTGGA 
ACATCTCCCT 
CCACCOCOGA 
AGAA6AAGGT 

tggcgaa06a 
agg;icaccac 
aggac6at6a 

GGTACTTGCT 
CCA6GAAGAC 
TTTCAAAGAA 
TCCTGCTGCA 
GCAGAGGTTA 



I 

GCTCCAQAGC 
OGTGGCCCTG 
GCTCCCAAAG 
CATGGC GACA 
G6ACAACCTG 
CCrrGQAQAQ 
GGCCAOGCTO 
CAOCCCCATC 
GATCATGCAG 
GGACITGAAA 
CAGACTCCOV 
T3\ACCACAGC 
CACCTGCACX: 
TTAATAAACC 



21 
I 

TC AGftOC CAC 
GTCTGTGGTQ 
TTGGCAGGGA 
CTG AAC3GCCC 
6AGAT0STTC 
AAGACTOGGA 
CTOQATACTG 
CAGAGOVTGA 
GGATTCATCA 
CAGATGQAAG 
CCCTTCCACA 
TCA6AAGAC0 
ATTGCCATGG 
CTTGGAGCAT 



31 
I 

CCACA6C06C 
TCCCGGCCAT 
CCTGGCACTC 
CTCTGAGGGT 
1GCACAQAT0 
ATOOVAAGAA 
ACTACGACAA 
TGTGCCAGTA 
GGGCTTTCAG 
AOCOG TGCOG 
CCTOCAQAGC 
AT(3U3GTGGT 
GGAOGCTGCr 
6 



41 
I 

AGCCAT6CTG 
GGACATCXXX 
CATGGCCATG 
CCACATCACC 
G6AGAACAAC 
6TTCAA6ATC 
TTTCCTGTTT 
CCTGGCCAGA 
GCCCCTGCCC 
TTTCTAGCTC 
AGTGGGACTT 
CATCTGTGTC 
00CTG6G6GC 



51 

I 

T0CCT0CT6C 
CAGACCAAGC 
GCGACCAACA 
TCACTGTTGC 
AGCTGTGTTG 
AACTATA0G6 
CTCTGCCTAC 
GTCCTGGTGG 
AG6CACCTAT 
ACCTGOQCCr 
OCTOCT OO CC 
6CCATGCGCT 
A6AGTCTCT6 



Seq ID NO: 107 Protein sequence:
Protein Accession #: AAA60147

1 11 21 31 41 51 

I I I I I I 

MDIFQTKODL BLPKLAGTHH SMAMATNKIS LHATLKAPIiR VHITSUiPTP EDNIiEIVXAR 
WENNSCVEKR VLGBKTGNPK KFKIMYTVAN EATI>l4DTDYD NFLFIiCLQDT TTPIQSHMOQ 
YLARVLVEDD BIKQGPIRAF RPIiPRHLWYL LDLKOMEEPC RF 

Seq ID NO: 108 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 48-794 



TCOCAGGCAO 
GTCTGATCOV 
TCATGAAAGG 
CAGTAGCCTA 
TTGAGCAGAA 
GOOAGAAGGT 
6CCACCTCAT 
GTGACTACTA 
ACTCAGCCCQ 
CCAACCCXIAT 
ACAQCCCOGA 
TGCACACGCT 
ACAACCTGAC 
AGOCCCAGAG 
TGCOGAGAGG 
CTCCAAAGGG 
CACTCTTCTT 
CGCACCCGCT 
CTGCCCCTGC 
6GACAGTGGC 
06GG0GGG0C 
TTCCTCTCAA 



11 
1 

CASTTAGCCC 
6AAGGCCAA6 
CGCCGTGGAG 
TAAGAACGTG 
AAGCAACGAG 
OGAGACTGAG 
CAAGGAGGCC 
COGCTACCTG 
GTCAGCCTAC 
COGCCTGGGC 
GQAOGCCATC 
CAG0GA66AC 
ACIGTG6A08 
CTGAGTGTTG 
ACTAGTATGG 
CTOOGTGGAG 
GCAGCTGTTO 
TCCTCCCGAC 
TGCCTCTGAT 
A0GG6CTGGA 
AGTGCAASAC 
TAAAGTTCCC 



21 
1 

GCGGCC06CC 
CTOGOUSAGC 
AAGG6CGAQG 
GTGGGGGGCC 
QAGGGCTCGG 
CTCCAGQGCG 
6GG6A0600G 
GCC6AGGTG0 
CAGGAGGCCA 
CTGGCCCIGA 
TCTCTG6GCA 
TCCTACAAAG 
GCCGACAAOG 
CCCGCCACaS 
GGTGGGA6GC 
AG6GACTGGC 
AGOGCACCTA 
CCCAGGACCA 
06TAGGAATT 
GATGGGTGTQ 
OGAGATTGAQ 
CTGTGACACT 



31 

I 

TGTGTGTCCC 
AGGOOQAACG 
AGCTCTCCTG 
AGAGGGCTGC 
AGGAGAAGGG 
TGTGCGACAC 
AGAGCOGGOT 
CCAOOGGTGA 
TGGACATCAG 
ACTTTTCCGT 
AGACCACTTT 
ACAGCACCCT 
OOGGGGAAQA 
CCCCGCCCT6 
CCCACCCTTC 
AGAGCTGA66 
ACCACTGGTC 
GGCTACTTCT 
GAGGAGTGTC 
TOTGT G TGT G 
GGAAA6CAT6 
C 



41 

I 

CAGAGCCATO 
CTATQAGGAC 
CGAAOAGCGA 
CTGGAGGGTG 
GCCOGAGGTG 
OGTGCTGGGC 
CTTCTACCTO 
GGACAAGAAG 
CAAGAAGGAG 
CTTCCACTAC 
CGAOQAGGCC 
CATCAT6CAG 
GGGGGGOGAG 
CCCCCTCCAG 
TCCCCTAGGC 
CCAGCTGGG6 
ATGCCCCCAC 
CCCCTCCTCT 
GOGCCTTGTG 
TGTGT6TGTG 
TCTGCTGG6T 



51 
I 

GAGAGAGCCA 
ATGGCAGCCT 
AACCTGCTCT 
CTGTCCAGTA 
CGTGAGTACC 
CTGCTGGACA 
AAGATGAA6G 
CGCATCATTG 
ATGCCGCCCA 
GAGATCGCCA 
ATOQCTGATC 
CTGCTGCQAG 
GCTCCCCAGG 
TCCCCCACCC 
6CTGTTCTTG 
CXGG66ATCC 
0CCT6CTCTC 
TGCXTPCCCTC 
GOTGAGAACT 
TGTGTGTGTG 
GTGACCATGT 



Seq ID NO: 109 Protein sequence: 
Protein Accession #: NP_006133.1

11 



31 



41 



51 



60 
120 
180
240 
300 
360 
420 
480
540 
600 
660 
720 
780 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780
840 
900 
960 
1020 
1080
1140 
1200 
1260 



1 11 21 

I I I I I I

MERASIilQKA KLAEQAERYB DMAAFMKGAV EKGEELSCEE RNLLSVAYKK WGGQRAAHR 60 
VLSSIEQKSN EBGSEERGPB VREYREKVET ELQGVCDTVL GLLDSELIKB AGDABSSVFY 120 
LKHKODYVRY LAEVATGDDR KRIIDSARSA YQBAMDISKK EMPPTNPIRL GItALNPSVFB 180
YfilANSPEEA ISLAKTTFDB AMADLBTXiSE DSYKDSTIflM QLLRDNLTLM 1ASNA6EBG6 240 
EAPQEPQS 

Seq ID NO: 110 DNA sequence
Nucleic Acid Accession #: NM_000695
Coding sequence: 407-1564 

1 11 21 31 41 51 

I I I I I I 

aVOGASlTGG TTT6G6AQCT GCCAOTCTCC T6G6AG6ATC GCAGTOVGCA 6AGCAGG6CT 60 

6AGGCCT6GG GGTAGGAGCA GAOOCTGCOC ATCT66AGQC AGCATGTCCA A6AAA66GA0 120 

TGGAGGTGCA GOGAAGGAOC CAGGGGCAGA GCCCAOGCTG GGGATGGACC CCTTCGAGGA 180 

CACACTGCGQ C GG CHGOGIX; AGGCCTTCAA CTGAGGGOSC AOGCGGCOGG CCGAGTTCOG 240 

GGCTGOGCAG CTCCAGGGCC TG6GCCACTT CCITCAAGAA AACAAGCAGC TTCTGCGOQA 300 




OGnSCTGGOC CAGGACCTGC ATAAG0CA6C TTTGGAG6CA 
TTGOCAtSUVC GAGGTTGA.CT AOGCTCTCAA GAACCTTCAG 
AOGGTCXAOCS AACCTGTTCA TGAAGCTGGA CTCGGTCTTC 
CCTGGTCCTC ATCATOjCAC CCTGGAACTA CCCATTGAAC 
GGGCAOCCTC CC06CA6GGA ATTGOGT GG T GOGtiAGCCG 
AGAGAAOGTC CTG6CT6AG6 TGCTGOCOCA GTACCTGGAC 
GCTGGGOGGA CCCCAGGASA CAGG6CA6CT GCTAGAGCAC 
CACAGGGAGC CCTC6TGTOG GCAAGATTGT CATGACT6CT 
T6TCACCCTG GAGCT6GGG0 6CAASAACCC CTGCTAOGTG 
GAO OS ltSGCC AACOGOST06 CCT6GTTCTG CTACTTCAAT 
OCCIGACXAC GTCCTGTGCA GCC0C6AGAT GCAGQA CftGG 
CACCATCACC 0STTTCTAT6 GOGACGAOOC CCAGAGCTCC 
CAACCAGAAA CAGTTCCAGC GGCTGCGGGC ATTGCTGGGC 
GGGCCAGAGC AACX3AGAGCG ATCGCTACAT CGCCCCCAOG 
GACGGAGCCT GTGATGCAGQ AGGAGATCTT C3GGGCCCATC 
GAGCXSTGOAC GAGGCCATCA AGTTCATCAA CCG6CAG6AG 
CTTCTCCAAC AGCAGACA6G TTGTGAACCA GATGCTG6A6 
TGGAGGCAAT GAGGGCTTCA CCTACATATC TCTGCTGTCC 
CCACAGTGGG ATGGGCCGGT ACCACGGCAA GTTCACCTTC 
CACCTGCCTG CTCGCCCCCT CCGGOCTGGA GAAATTAAAG 
TACX3GACIOG AACCAGCAGC TGTTAOGCTG 6GGCATGGGC 
GTGA80GT0C CACCOQOCTC CAAOGGGTCA CACA6AGAAA 
GCTTATGCrC CCAACTCAC» TTGTTOCTCC AGACOGCAGG 
T6GAGCTGTC ACATGACTGC ATCCTGCCTG CCRGGGCTGC 
TCTGGGGGAC GCTGCTOGAG AGAGGCXX»G AGGCOGCAGA 
CACCCCAOCC TCCCCAATTC CAGOOCTTTG CCCTCTCGGT 
CACAGGGGCA GTGTCACOCT GGAAAATACA GTGCCCFGCC 
OAACQOTTOA GAG06T6GA0 CCCTGCAGGC CTTTGCTCTC 
TTCCAOCTCT GCCOCATCCC AACrOCACX» GCACTGCXTTC 
CCCACACTGG TCTCTGCACC ACCOCTCTGG TTCACACCGC 
AGCTCCATCC ACTGGGAAAA CTGGGGTTTG C3VTCACTCCA 
CTGGGGGC3UI GTCXX:TTGAC TTCTCTGAGC CTCAGTTTCC 
CCAAAATQQA GTCACTTATG CCAAACTCTA ATAAAATGGA 
CCCTCACACA CACATGCCGO TAACAGGATT TATCACCAAG 
A6ACACA666 CGTATGGAAA AGOVOGTCXTT CAAAGACTGT 
GATGCTTACC TACXaOGGCC GTCTCCACCA GAAAACCATC 
TGTGACTTAC AAACCTTGTT TAAAAOCTGC TTACATGGAC 
COCXTGGCTG TGGOCCTCTO TOTATOCCTG OGATCCTTOC 
GGAATCCTCT OCTCCTCCCA AATAAATTCA TCTGTTC 



Seq ID NO: 111 Protein sequence:
Protein Accession #: NP_000686






1 

1 

MKDEPRSTHL 
XSQGTEICVLA 
KHLTPVTLEL 
PALQSTITRP 
VDVQETEPVM 
SSGSFOGNBG 
BYPPYTDHNQ 



11 

) 

PMKLDSVFIW 
EVLFQYLDQS 
GGKNPOrVDD 
YGDDPQSSPN 
QEEIFGPILP 
FTYISLLSVP 
QUjRNOeGSQ 



21 

I 

KEPFGLVLZI 
CFAWLGGPQ 
NCDPQTVANR 
LGRIINQKQP 
IVNVQSVDEA 
PGGVCSSGMG 
SCTLL 



31 
1 

APMNYPX^T 

VAWFCyPNAS 

QRIiRALLGCG 
IKPINRQEKP 
RYHGKPTFDT 



GACATATCTG 
GCCTGGATGA 
ATCTGGAAGG 
CTGACCCTGG 
TCAGAAATCA 
CAGAGCTGCT 
AAGTTGGACT 
GCCACCM6C 
GAOGACAACT 
GCOGGCCAGA 
CT6CTGCC06 
CCAAACCTGG 
TG06GCC6CG 
GTGCTGGTGG 
CT6CCCAT0G 
AAGCCCCTG6 
G6GACCAGCA 
GTGCCATTOG 
GACACCTTCT 
GAGATCCGCT 
TCCCAGAGCT 
CCT6AGTCTA 
CTCCCCCAGC 
AAAGCAAGGT 
ACATG0CA6G 
CAGGGTTGGC 
TTCTTAGGGG 
CCCTCTAGGC 
CCCCAGGGAT 
ACCCTQCACT 
CTGCACAGTG 
TTATGTGAAA 
6T0SG6QGQ6 
AGAOGOCTGC 
AGTATTCCAG 
GCCAACTCCT 
TTCTGTCCrr 
AAGCACTCAT 



41 

1 

LVLLVGTIiPA 
DTIFFTGSFR 
QTCVAPDYVL 
RVAIGGQSNE 
LALYAFSKSR 
PSBHRTOiLA 



Seq ID NO: 112 DNA sequence 
Nucleic Acid Accession #: NM_004456
Coding sequence: 58-2298 



1 

! 

GAATTCCGGG 
GGCCAGACTG 
6AGTACAIGC 
TTTAGTTCCA 
CAGOQAAQGA 
GAGTGTTCGG 
AAlt3CAGTT6 
OIGGAAGATO 
GAT6GTACTT 
GAATGTGGGT 
AATGATGATG 
GATCTGGAGG 
AAAATTTTG6 
GAAAAATATA 
CCCAAOVTAG 
CATA06CTTT 
ACACCCAACA 
CCACAGTGTT 
CGGATAAAGA 
' AGTAGCAGGC 
AGGGAAGCAG 
GATGAAACTT 
CCAAATATTG 
GTCCTCATTG 
ACATGTAGAC 
GCTGAGGAT6 
CACTGCAGAA 



11 

I 

CGACGCGCGG 
G6AAGAAATC 
GACTGAQACA 
ATGGTCAGAA 
TACAGCCTGT 
TGACCAGTGA 
CTTCAGTACC 
AAACTGTTTT 
TCATT6AA6A 
TTATAAATGA 
AOGATGATGA 
ATCAC06A6A 
A060CATTTC 
AAGAACTCAC 
ATGGACCAAA 
TCTGTAGGCG 
CTTATAAGOG 
ACCAGCATTT 
CCCCACCAAA 
CCAGCACCCC 
GGACTGAAAC 
CXSAGCTCCTC 
AACCTCCTGA 
6CACTTACTA 
A66TGTATGA 
TGGATACTOC 
AGATAQ^C^T 



21 

1 

GAACAACGCG 
TGAGAAGG6A 
GCTCAAGAG6 
AATTTTGGAA 
GCACATCCTG 
CTTGGATTTT 
CATAAT6TAT 
ACATAACATT 
ACTAATAAAA 
TGAAATTTTT 
TGATGGAGAC 
TGATAAA6AA 
CTCAATGTTT 
OGAACAGCAG 
TGCTAAATCT 
ATGTTTTAAA 
GAAGAACACA 
GGAGGGAGCA 
A06TCCAGGA 
CACCATTAAT 
GGGGGGAGAG 
TGAAGCAAAT 
GAATGTGGAG 
TGACAATTTC 
GTTTAGAGTC 
TCCAAGGAAA 
GAAAAAGQAC 



31 
I 

AGTCGGCGOG 
CCAGTTTGTT 
TTCAOAOGAG 
AGAAGOGAAA 
ACTTCTGTGA 
CCAACACAAG 
TCTTGGTCTC 
OCTTATAXGO 
AATTATGATO 
GTGGAGTTG6 
GATCCTGAAG 
AGCOGOXAC 
CCAGATAAOO 
CTCCCAG608 
GTTCA6AGAG 
TATGACTGCT 
GAAACAGCTC 
AAGaAGTTTG 
GGC08CA6AA 
GTGCTGGAAT 
AACAATGATA 
TCTCGGTGTC 
TGGAGTGGTG 
TGTGCCATTG 
AAAGAATCTA 
AAGAAGAGQV 
GGCTCCTCTA 



41 

I 

OOGGAOGAAO 
GQOGGAAGOG 
CTQATSAAiGT 
TCTTAAACCA 
6CTCATTG0G 
TCATCCCATT 
GCCTACAGCA 
GAGATGAAOT 
06AAAGTACA 
TGAATGCCCT 
AAAGAGAAGA 
CTCGGAAATT 
QC3^CAGCAGA 
CACTTCCTCC 
AGCAAAGCTT 
TCCTACATCC 
TAGACAACAA 
CroCTGCTCT 
QAGGACX^GCT 
CAAAGGATAC 
AAGAAGAAGA 
AAACACXZAAT 
CTGAAGCCTC 
CTAQGTTAAT 
GCATCATAGC 
AACACCGGTT 
ACCATGTTTA 



ABCTCATOCT 
A0GAT6AA0C 
AACOCTTTGG 
TGCTOCTGGT 
GOCAGGGCAC 
T T G OOSTGgT 
ACSVTCTTCTT 
ACCTGAOGCC 
GOGACCCCCA 
CCTG06TGGC 
OOCTGC AGAG 
GC06CATCAT 
TGGCCATTGO 
AOGTGCAGGA 
T6AA0GTGCA 
0CCIGTA06C 
6CGGCA6CTT 
GGGGAGT06G 
CCCACCACCG 

AcccacxxrrA 

GCACCCTCCT 
GCCATGAGGG 
CTCAGGTTGC 
CTTQCrrCTA 
TGTCCTCACT 
CA6GCCCAGT 
CATCAGCCCT 
ACAGGOGCAC 
CCTCTCACAT 
CACCCACAGC 
TTAGTGGGAC 
GTTGCTGGAA 
CACATAQAAO 
ATGTAA6ACC 
ATGAGCTGCA 
GCGATCAGCT 
TAAAACGTTC 
AGOCCAOATA 



51 
I 

GNCWLKPSE 
VGRIVKTAAT 
CSPEMQERLL 

SDRYIAPTVL 
QVVNQMLERT 
PSGLEKLKEX 



51 
1 

AATAATCATG 
TGTAAAATCA 
AAA6AGTAT6 
AGAATGGAAA 
CGGGACTAGO 
AAAGACTCTO 
GAATTTTATO 
TTTA6ATCA6 
0GGGGATA6A 
TGGTCAATAT 
AAAGCAGAAA 

TCxrrrcTGAT 

ASAACTAAAO 

TC^UITGTAOC 
ACACTCCTTT 
TTTTCATGCA 
ACCrrGTGGA 
CAOOQCTGAG 
TCOCAATAAC 
A6ACAGTGAT 
AGA6AAGAAA 
AAAGATGAAG 
AATGTTTAGA 
TGGGACX3VAA 
TOCAGCTCCC 
CTGGGCTGCA 
CAACTATCAA 



360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 



60 
120 
180 

240 
300 
360 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080
1140 
1200 
1260 
1320 
1380
1440 
1500 
1560 
1620 




CCCrCTGATC ATCCA0G6CA GCCTTGTGftC 
TTTTBTGAAA AGTTTTGTCA ATGTAGTTCA 
TGCAAAGCAC AGTGCAACAC CAA6CAGTGC 
CCTGACCTCT GTCTTACTTG TGGAGCOGCT 
AAGAACTGCA GTATTCAGC6 GG6CTCCAAA 
(3CA6GCTGG6 GCSATTTTTAT CAAAGATCCT 
TGTGGA6A6A TTATTTCTCA AGAT6AAOCT 
ATGT6CAGCT TTCTGTTCAA CTTGAACAAT 
AACAAAATTC GTTTTGCAAA TCATTOGGTA 
6TTAACGGTG ATCACAGGAT AGGTATTTTT 

eAumviixa aitacagata cagccaogct 

GAAATG6AAA TCCCTTGACA TCTGCTACCT 
CTTCAGGAAC CTOCSAGTACT GTGGGCAATT 
AATTTGCAAA GTACTGTAAG AATAATTTAT 

GcxrrrcTCAC cagctgcaaa gtgttttgta 

TACATTTTTC AACTTTGAAT AAAGAATACT 



AGTTOGTGCC CTTGIGTGAT AGCACAAAAT 1680 

GA6T6TCAAA AC06CTTTCC GGGATGC06C 1740 

CCGTGCTAOC TGGCTGTCOG AGA G TGTGAC 1800 

GACCATTGGG ACAGTAAAAA TGTGTCCTGC 1860 

AAGCATCTAT T6CTGGCACC ATCTGAGGTG 1920 

GT6CAGAAAA ATGftATTCAT CTCA6AATAC 1980 

GACAC3AAGAG GGAAAGTGTA TGATAAATAC 2040 

GATTTTGTGG TGGATGCAAC C06CAAGGGT 2100 

AATCCAAACT GCTATGCAAA AGTTATGATG 2160 

GCCAAGAGAG CCATCCASAC TG606AAGAS 2220 

GATGCCCT6A AGTATGTGGO CATC6AAAGA 2280 

CCTOCCCCTC CTCTGAAACA GCT6CCTTA6 2340 

TAGAAAAAGA ACATGCAC3TT TGAAATTCTG 2400 

AGTAATGAGT TTAAAAATCA ACTTTTTATT 2460 

CCAGTGAATT TTTGCAATAA TGCAGTAT6G 2520 
TGAACTTGAA AAAAAAAAAA AAAAAA 



Seq ID NO: 113 Protein sequence: 
Protein Accession #: NP_004447



1 11 21 31 41 51 

I I I I I I 

MGQTGKKSEK GPVCWRKRVK SEYMRLRQLK RFRRADSVXS MFSSNRQKIL ERTEIliNQEW 60 

KQRRIQPVHI LTSVSSLRGT RECSVTSDIiD FPTQVIPUCT LNAVASVPIM YSWSPLQQNF 120 

r4VSD£TVIiHN IPYMGDEVLD QDGTFIEELI KNYDGKVHGD RE06PINDEX FVEIiVHALGQ 180 

VNDODDDDDG DDPEEREBKQ KDLEDHRDDK ESRPPRKFPS 0KILEAI5SH FPDKGTAESL 240 

KEKVKELTGQ QLPGALPPEC TPNIDGPNAK SVQREQSLHS PHTLPCRRCF KYIJCFLHPPH 300 

ATPNTYKRKH TETALDNKPC GPQCYQHLBO AKEFAAALTA ERIKTPPKRP GGRRRGRLPN 360 

NSSRPSTPTI NVLESKDTDS DREAGTBTGO ENNDKEEEEK KDETSSSSEA NSRCQTPIKM 420 

KPNIEPPENV EWSGAEASMP RVLIGTYYDN PCAIARLIGT KTCRQVYEFR VKESSIIAPA 480 

PAEDVDTPPR KKKRKHRI.HA AHCRKIQLKK DGSSNHVYMY QPCDHPRQPC DSSCPCVIAQ 540 

NFCEKFCQCS SBCQNRFFGC RCKAQOmOQ CPOTLAVREC DPDLCLTC6A ADHWDSKKVS 600 

CKNCSIQRGS KKBLLLAPSD VA6HGIFIRD PVQKNEFISE YCGSZZSQDE ADRRGKVYDK 660 

YMCSFLFNLN HDFWDATRK GSKIKEKSBS VNPNCYAKVM MVI»a)BRIGX FARRAIQTGE 720 
ELFVDYRYSQ AXUOiKyVGIB REHEZP 

Seq ID NO: 114 DNA sequence
Nucleic Acid Accession #: NM_001827
Coding sequence: 96-335 



1 11 21 31 41 51 

I I I I I I 

AGTCTCOGGC GAGTT6TTGC CTGGGCTG6A 06TGGTTTTG TCTQCTG08C COGCTCTTOS 60 

OGCTCTOGTT TCATTTTCTG CAGCGCGCCA CX3AGGATGGC CCACAAGCAG ATCTACTACT 120 

OGGACAAGTA CTTC3GACGAA CACTAOGAOT ACCGGCATGT TATGTTACCC AGAGAACTTT 180 

CCAAACAAGT ACCTAAAACT CATCTGATGT CTGAAGAGGA GTGGAGGA6A CTTGGTGTCC 240 

AACAOAOTCT AG6CT00GTT CATTACATGA TTCATGAGGC AGAAOCACAT ATTCTTCTCT 300 

TTA6A0QACC TCTTCCAAAA GATCAACAAA AATSAAGTTT ATCTGGGGAT (XntAAATCT 360 

TTTTCAAATT TAATGTATAT GTGTATATAA GGTAGTATTC AGTGAATACT TGAGAAATGT 420 

ACAAATCTTT CATCCATACC TGTGCATGAG CTGTATTCTT CACAGCAACA GAGCTCAGTT 480 

AAATGCAACT GCAAGTAGGT TACTGTAAGA TGTTTAAGAT AAAAGTTCTT CCAGTCAGTT 540 

TTTCTCTTAA 3T00CTOTTT GAOTTTACTG AAACAGTTTA CTTTTOTTCA ATAAAGTTTO 600 
TATGTTGCAT TTAAAAAAAA AAAAAAA 



Seq ID NO: 115 Protein sequence:
Protein Accession #: NP_001818



1 11 21 31 41 51 

I I I I I I

MAHKQIYYSD KVFDEHYEYR HVMLPRELSK QVPKIBIiHSE BEHRRLGVOQ SLGWVEyHZH 60 
EPEFHILLFR RPLPKDQQK 



Seq ID NO: 116 DNA sequence

Nucleic Acid Accession #: CAT cluster



1 11 21 

I I I 

TCA6ACCTCA T6AGTCACTT 66ACTCTTGA 
GCATCTGGAC CCTTGGTGCT ATCGAOGAAG 
AGAGGTGTGT TCCAGGGAAA GCCCCTATCT 
GCA6CCAACA GAGTTCAAAA TGCAGGCTTG 
AAOSAC TGAT C CACATT CCC ACCAOGAAGT 
CCTTGGAA6G ACCIGGCTCA OGCTGGACCA 
TCAAGAATTC TTTGCTQAGC ATGGTGCCTC 
AGTGTGGGAG GATCTCTTGA GCCCAGGAGT 
CCCATCTCTA AAATAATAAT AATAATAAAA 
CCTGTASTTC CAGCIACCCA GGAOGCTGAG 
AGGCTGCAAT GAACTOTGAT TACCCCACTG 
CCTGTCTCAA ATAATAATAA TAATAATAAT 
TTTGAGGTGC CATTTGGGTA GAAAGAAAAG 
CCTGAAGGAG CAGAGGGATG CATCGCTGGA 
GACAGACCTT GTCCTTCTTC CTTGTGGAAA 
CTCTTCCCCC TCCCT6TCCC AGGGAACCAA 
CCGCCTCCCA TGTCTGCTGT OCCTTTGTAC 
CAGCCGGATA CAGAGTGAAT AGTTAACCAC 
TCCTGCTCOB T6TAAA6A66 CCAGT6TTT6 



31 41 51 

I I I

GCCACCTCTG GGGGTGOAGT CTCTCTCCTG 60 

CTTGGGTGGG GCTCTTAGCT GCTATGTGCA 120 

CTCTGCAGAG GTCAAGTGAA AGCGAC6GCC 180

GAAAGTACAG GGOGCTCrGT QGAGGATGGG 240 

TTAGCAGAAC CCC0GC6TGC CAACTGGACC 300 

CCTCTTGA6A 66GA6GAGCT CTG6ATTTGA 360 

ATGCCTATAA TACCAACACT TTG6GA66CC 420 

TCAAGACTAG CCTGGGCAAC ACAGAGAOAA 480 

TAAAAAATTA GCAGGGCATG GTGGCATGT6 540 

OCAAGAOGAT 6GCTGGAGGC T6G6A3X3TT6 600 

CACTCCA6CC TGG6CAAAA6 AG06AGAGAA 660 

CTTATTTTGG AGAATAAAGA GACCTCTGGA 720 

AOGTTTACAC OGAGAAATAQ TCTGTQTTGC 780 

GGTGACCTAC AGTTGAAGAA GACTCATTAT 840 

OTtf riTOC T C TGCIGCTACT GC TCAT SACA 900 

AGGGCTTTCT ACCACACCCT TTCTTGCOCC 960 

TCA6CAATTC TTGTCTGCTC CATTATCTTC 1020 

ACTTAGGTCA AATAGGATCT AAATTTTTGT 1080 

T6TGTT6CAA GCAGCCTTGG AATAGTAACT 1140 




CTTCTCATTT GTTTGGGATC TGGCCACCAA GTTCCAGAAT GATACACGGA TCAGTGCAGA 1200 

AGTTCATCAG GCTCTOSGAC CTTAGGGCTG TTGGAGAAGG CTTCAGCAGC AGAACTGATQ 1260 

GTGAAGGCTC GTGTTCTCCA TOCTCAACTT TCTTTGCTTC GATCATACAC AAGAATACAT 1320 

TTGGAAGGGC AAAAAATGAA CACIGTCGTT CATT6CAGCC b W r i TTOTG ACACAOATGC 1380 

ACAGTCTGCT GTGAAGACCT TCTCTCAAGT GGCATrTGGG AGTOCATGGC AGATCATGST 1440 

GCTTCATCAG AQACTGACAG CTATCAGGGG TTGTGGCACT TAGTGAGGAC TCTCXTCCCC 1500 

CA6TGTGTCC TGATGACACA TACACACCTG ACAATAGCTT GAGTCTTCTC TGTTCCTTTT 1560 

ACTCT6TAGC CAACATACAC ATGATTTAAA ACCCTTTCTA AATATCTATC ATGGTTCATC 1620 

CTTGTCCAAA TGCAGAGTCA GAGCTATTTG TACTTCATTA TTATTTCCAA GGOGAATAGT 1680 

TGGCTTTCTT TTTGCAAAAA TAATTAAAGT TTTTOTATGT TGCAAAAAAA AAAAAAAAAA 1740 
AAACAAAAAA 

Seq ID NO: 117 DNA sequence

Nucleic Acid Accession #: BC012178.1

Coding sequence: 204-2285 

1 11 21 31 41 51 

I I I I I I

CTTCTCTCCC GOGGOGCTGG GGCCC3GCGCT CCGCTGCTGT TGCTCCATTC GGCGCnTTC 60 

TGGOGGCTGG CTCCTCTCCG CXGCOGGCTQ CTCCTCGACC AGOCCTCCTT CTCAACCTCA 120 

GC006CG606 C06ACCCTTC OQ6CA0CCTC CCOCCCOGTC T06TACTGTC GC08TCAC06 180 

CCGCGGCTCC GGCCCTGGOC CGGATG6CTC TCTGCAACTGG AGACTCCAA6 CTGGA6AAT6 240 

CTGGAGGAGA CCTTAAGGAT GGCCACCACC ACTAT6AAGG AGCTGTTGTC ATTCTGGATG 300 

CTGGTGCTCA GTAOGGGAAA GTCATAGACC GAAGAGTGAG GGAACTGTTC GTGCAGTCTG 360 

AAATTTTCCC CTTGGAAACA CCAGCATTTG CTATAAAQGA ACAAGGATTC C6TGCTATTA 420 

TCATCTCTG6 AGGACCTAAT TCr GTOTAT G CTGA AGATSC TCCCIGGTTT QATCCA6CAA 480 

TATTCACTAT TG6CAAGCCT GTTCTrGGAA T7T6CTATG6 TATGCAGAT6 AT6AATAAG0 540 

TATTTGGAGG TACTGTGCAC AAAAAAAGTG TCAGAGAAGA TGGAGTTTTC AACATTAGTG 600 

TGGATAATAC ATGTTCATTA TTCAGGGGCC TTCAGAAGGA AGAAGTTGTT TTGCTTACAC 660 

ATGGAGATA6 TGTAGACAAA GTAGCT6ATG GATTCAAGGT TGTGGCAOGT TCTGGAAACA 720 

TAGTASCA66 CATAGCAAAT GAATCTAAAA AGTTATAT60 AGCACAOTTC CACOCTGAAG 780 

TT08CCTTAC AGAAAATGGA AAAGTAATAC TGAAGAATTT CCTTTATGAT ATAGCT66AT 840 

GCAGT6GAAC CTTCACCGT6 CAGAACAGAQ AACTT6AGTG TATT06AGAG ATCAAAGAGA 900 

QAGTAGGCAC GTCAAAAGTT TTGOmTAC TCAGTGGTGG AGTAGACTCA ACAGTTTGTA 960 

CAGCTTT6CT AAATOGTGCT TTGAAOCAAG AACAAGTCAT T6CT6T6CAC ATTGATAAT6 1020 

GCTTTATGA6 AAAAOGAGAA AGCCAOTCTO TTCSAA6A6SC CCTGAAAAA6 CTTQ0AA7TC 1080 

AGGTCAAAGT GATAAAT6CT GCTCATTCTT TCTACAATGO AACAACAACC CTAOCAATA7 1140 

CAGATGAAGA TAGAACCCCA CG6AAAAGAA TTAGCAAAAC (TITAAATATG ACCACAAGTC 1200 

CTGAAGAGAA AABAAAAATC ATTGGGGATA CTTTTGTTAA GATTGCCAAT GAAGTAATTG 1260 

6AGAAATGAA CTTQAAACXS^ GAG6AQGTTT TCCTT6CXXA A6GTACT7TA OGGOCTGATC 1320 

TAATTGAAAG TGCATCCCTT GTTGCAAGIG GCAAAGCTGA ACTCATCAAA ACCCATCACA 1380 

AT6ACACAGA 6CTCATCA6A AAGTTGAGAG A6GAGGGAAA AGTAATAGAA CCTCTGAAAG 1440 

ATTTTCATAA AGATGAAGTG AGAATTTTGG GCAGAGAACT TGGACTTCCA GAAGAGTTAG 1500 

TTTCXAGGCA TCCATTTCCA GGTOCTGGCC TGQCAATCAG AGTAATATGT GCTGAAGAAC 1560 

CTTATATTTG TAAGGACTTT CCTC3AAA0CA ACAATATTTT GAAAATAGTA GCTGATTTTT 1620 

CTOCAAGTGT TAAAAAOCXA CATAOCCTAT TACAGAOAGT CAAAGOCTGC ACAACA6AA0 1680 

AGGATGAGGA GAAGCTGATG CAAATTACCA 6TCTGCATTC ACTGAATGCC TTCTTGCT6C 1740 

CAATTAAAAC TGTAGGTGTG CAGGGTGACT GTCGTTCCTA CAGTTACGTG TGTQGAATCT 1800 

CCAGTAAAGA TGAACCTGAC TGGGAATCAC TTATTTTTCT GGCTAGGCTT ATACCTOSCA 1860 

TGTGTCACAA 0G7TTAACAGA GTTGTTTA7A TATTTGGCCC ACXIAGTTAAA GAACCTCCTA 1920 

CAGATGTTAC TCGCACTTTC TTGACAACAG G66TGCTCA0 TACTTTACGC CAA6CTGATT 1980 

TTGAGGCCCA TAACATTCTC AGGGAGTCTG GGTATGCTGG GAAAATCAGC CAQATGCCGO 2040 

TGATTTTGAC ACCATTACAT TTTGATOGGG ACCCACTTCA AAAGCASCCT TCATGCCAGA 2100 

GATCTGTGGT TATTCGAACC TTTATTACTA GTGACTTCAT GACTGGTATA OCTGCAACAC 2160 

CTGGCAATGA GATCCCTGTA GA6GTGGTAT TAAASAT6GT CACTGAGATT AAGAAGA7TC 2220 

CTGGTATTTC TOQAATTATG TATQACTTAA CATCAAAGOC CCCAGGAACT ACTQAGTGGG 2280 
AGTAATAAAC TTCTTGTTCT ATTAAAA 



Seq ID NO: 118 Protein sequence:
Protein Accession #: AAH12178.1

1 11 21 31 41 51 

I I I I I I

MALCKGDSRL QIAGGDLKDG HHHYBGAWI LDAGAQYGKV IBRRVRBLFV QSEIFFLETP 60 

APAIKBQGFR AIIIS66FHS VYAS3AFWFD PAIFTIGKFV LGICYGKQMM MXVFGGTVHK 120 

KSVREDGVFN ZSVD23TCSIiP RGLQKESWL I*TKGDSVDKV ADGFKWARS GMIVAGIANE 180

SKKLYQAQFH PBVGLTENGK VILKUFLYDI AGCSGTFTVQ NRELEaREl KERVGTSKVl. 240 

VLLS66VDST VCTALU7RAL NQBQfVIAVHI DNGFMRKRES QSVEEALKKL GIQVKVINAA 300 

H5FYNGTTTL PI8DEDRTPR KRISKTLNMT TSPEEKRKIX GDTPVKIANE VIGQxINIiKPB 360 

EVFLAQGTLR PDLIESASLV ASGKAELIKT HEKDTSLZHK LREBGKVIEP LKDFHKDSVR 420 

ILGR£LGI*PS ELVSRHPPP6 PGIAXRVICA EEFYICKD7P ETNNILKIVA DPSASVKKPE 480 

TLLQBVKACT TEBDQEiaJ4Q ITSLHSIdlAF LLPIKTVOVQ GDCRSYSyVC GISSKDBFDH 540 

ESLZFLARLZ PRMCHNVMRV VyZFGPFVKB PPTDVTPTFIt TT6VLSTLRQ ADFEAHiriltR 600 

E8G7A6RZSQ MFVZLTPLBP DRDPLQXQPS GQRSWZRTF ZT8DFMTGZP ATPGHBIPVB 660 
WLXMVTEZK KIFGISftZMY DLTSKPPGTT ENS 

Seq ID NO: 119 DNA sequence

Nucleic Acid Accession #: NM_006500.1

Coding sequence: 27..1967

1 11 21 31 41 51 

I I I I I I

ACTTGOGTCr OQCCCTOOGG CCAAGCATGG GGCTTCCCAG 6CTQGTCTGC GCCTTCTTGC 60 

TOGCCGCCTG CTGCT G CrGT CCTOSCGTOG OGOGTGTGCC OGGAGAGOCT GAGCAGCCTO 120 

OGCCTGAGCT GGTGGAGGTG GAAGTGOGCA GCACAGCCCT TCTGAAGTOC GGCCTCTOCC 180 

A6TCCCAAGG CAACCTCAGC GATGTOSACT GGTTTTCTGT 0CACAAG6AG AAGC36GAOSC 240 
TCATCTTCrG TGTG06CCAG G60CAGGGCC AQACC3GAACC TGGGC3A6TAC GAGCA6066C 300

TCAGCCTCCA GGACAGAOGG GCTACTCTOS COCTGACTOl AGTCACCOOC CAAGAOGAGC 360 

GOITCTTCTT GTGCC AGGGC AftGOGOCCTC GOTCCCAGGA G TAOOG CATC CAGCTCOGCG 420 

TCTACAAAGC TCOGGAOGAO OCAA&CATOC A6OTCAAC0C CCTGG6CATC CCT6TGAACA 480 

GTAAGGAGCC TOUXSAGGTC 6CTACCT6TG TAGGGAOGAA C366GTACOCC ATTCCTCAAG 540 

TOlTCrGGTA CAAGAArTGGC OGGCCTCTGA AGGAGGAGAA GAACOGGCTTC CACATTCAGT 600 

CGTCCC3WGAC TGTGGAGTCX3 AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC 660 

TGGTTAAAGA AGAC3VAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACOSO CTGCCCAGTG 720 

GGAACCACAT GAAGGAGTCC AGGGAAGTCA COGTCCCT G T TTTCTAC006 ACA6AAAAAG 780 

TGTQ6CTGGA AGT6GAGCCC GTCGGAATGC TGAAGGAAGG GGAC06GGTG GAAATCAGGT 840 

GTTTGGCTSA T66CAACCCT CCACCACACT TCAGCATCAG CAA0CAGAAC CCCAGCACCA 900 

GGGMQOCAGA GGAAGAGACA ACCAACGACA ACGGGGTCCT GGTGCTGGAG CCTGCCCGGA 960 

AG6AACACAG TGGGOGCTAT GAAT6TCAG6 CCTGGAACIT GGACACCATG ATATOGCTGC 1020 

TC5AGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA CGTCX3GAGTG AGTCCCGCAG 1080 

CCCCTGAGAG ACAGGAAGGC AGCAGCXTTCA CXXrTGACXrTG TGASGCAGAG AGTAGCCAGG 1140 

ACCTOQAOTT CCAOTGGCTO AGAGAAGASA CAGACCAGGT GCTG6AAAQ6 GGGCCTGTGC 1200 

TTCAOTTGCA TGACCTGAAA CX3GGAG6CAG G3U3GOGGCTA TOSCIGOS T g G06TCTGT6C 1260 

CCAGCATACC OGGCCTGAAC OGCACACAGC TGGTCAA6CT GGCCATTTTT GGCCCCXCTT 1320 

GGATOOCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT 1380 

6TGAAG0GTC AGG6CACCCC OGGCCCACXA TCTCCTGGAA CGTCAACX»3C ACX3GCAA6TG 1440 

AACAAGACCA AGATCCACAO OSAGTOCTGA GCAGOCTGAA TO'CCTOSTG ACCCG6GAGC 1500 

TGTTOGAOAC AGGTOnQAA 1GC3VGG6CCT OCAAOGACCT G6GCAAAAAC ACCA6CATCC 1560

TCTTCCT06A GCTGGTCAAT TTAACXavCCC TCACACCAGA CTCCAACACA ACCACT06CC 1620 

TCAGCACTTC CACTX3CCAGT CCTCATACCA GAGCCAACAG CAOCTCCACA QAGAGAAAQC 1680 

TGCOGGAGCC 6GAGAGCCGG GGCGTGGTCA TOSTGGCTG T 6ATTGTGTGC ATCCTCGTCC 1740

TGGC3GGTGCT GGGGGCTGTC CTCTATTTOC TCTATAAGAA G6GCAAGCT6 C06T6CAGGC 1800 

6CTCA66GAA 6CA6GAGATC A0GCT6CC0C OGTCTOGTAA GACOGAACTT GTAGTTGAAG 1860 

TTAAGTCAGA TAAGCTCCCA GAAGA6ATGG GCCTCCTGCA GGGCAGCAGC GGTGACAAGA 1920 

G06CTCCXSGG AGACX^GGGA GAGAAATACA TOGATCTGAG GCATTAGCXX: OGAATCACTT 1980 

CAGCTOCCTT C0CT6CCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGOCSU^G 2040 

0CTCCAAAG6 GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACOC CCTTTCAGAG 2100 

G6CCACTGGG TTAGGACXTTG AGGACCTCAC TTGGCCCTGC AAGC06CTTT TCAGG6ACCA 2160 

GTCCACCACC ATCTCCTCCA OGTTGAGTGA AGCTCATCCC AA6CAAGGAG CCCCAGTCTC 2220 

CXX5AGCGGGT AGGAGAGTTT CTTGCAGAAC GTGTrTTTTC TTTACACACA TTATGGCXGT 2280 

AAATACCTGG CTCCT6CCAG CAGCTGAGCT GQGTAGCCTC TCIGAGCTGG TTTCCTCCCC 2340 

CAAAGGCTG6 CITCCACXAT CCAG6T6CAC CACTSAASTG AOGACACACX: GOAOCCAGGC 2400 

GCCTGCTCAT GTTGAAGTGC GC7GTTCACA OCOGCTCCGO A6AGCACCGC A6CGGCATCC 2460 

AGAAGCAGCT GCAGTGTT6C TGCCACCACC CTCCTGCTCG CCrCTTOJA GTCTCCTGTG 2520 

ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCOGG 2580 

6GCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAG6CGGA 660GGGCQ6A 2640 

TCACAAAOTC AGGAOGAGAC CATCCTGGCT AACA08GT6A AACCCTGTCT CTACTAAAAA 2700 

TACAAAAAAA AATTAGCTA6 GCGTAGTGQT TGGCACCTAT AGTCCCAGCT ACT0G6AA66 2760 

CTGAAGCASG AGAATGGTAT GAATCCAGGA 66TGGAGCTT GCAGTGAGCC GAGACCGTGC 2820 

CACTGCACTC CAGCCTGGGC AACACAGCX3A GACTCCGTCT CGAGGAAAAA AAAAGAAAAG 2880 

AOGCGTACCT GOGGTGAGGA AGCTGGGC3GC TGTTTrOGAG TTCAGGTGAA TTAGCCTCAA 2940 

TCCCCGTGTT CACTTGCTCC CATAGOCCTC TTGAT6GATC AGGTAAAACT GAAAG8CAGC 3000 

G06GAGCA6A CAAAGATGAG GTCTACACT6 TCCTTCATGQ GOATTAAAOC TATGGTTATA 3060 

TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC OCTAQAAGOG CXXZAAATGAG 3120 

AGAATGGTAC TTAGGGATGG AAAAC36GGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT 3180 

CTGTGTGTAT QCATACATAT GTGIGTATAT ATOGTTTTGT CAGGTGTGTA AATTTGCAAA 3240 

TTGTTTCCTT TATATAT6TA TGTATATATA TATA7GAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TGTCCCAGAA AATCATACAT T6CTTTTTTA TTCTACATGQ GTA0CACA6G 3360 

AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCAGC 3420 

AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAG 3480 

CTAGCCTACT TTTCAGGAGC AAAACGTCCC GTATQACGCA GCAOGAAGGG CCTGGCAGGC 3540 
TOTTAGCAQO AOCTATGTCC CTTOCTATGG TTTCC6TCCA CTT 



Seq ID NO: 120 Protein sequence:
Protein Accession #: NP_006491.1



1 11 21 31 41 51 

I I I I I I 

MGLFRLVCAF LLAACCXXrPR VASVPGEABQ PAPELVEVEV GSTAUJCOGL SQSQGNLSHV 60 

DWFSVHKEKR TLIFRVRQGQ GQSEPGBYEQ RLSIiQDRGAT LALTQVTPQD ERIFM3QGKR 120 

PRSQBYRIQL RVYKAPBBPN IQVNPUSIPV NSKBPEBVAT CVGRNGYPIP QVIHYKNGRP 180 

LKEER2IRVHI QSSQTVBSSG LYTLQSILKA QLVKEDXDAQ FYCEIiNYRLP S(a]H»4KESRB 240 

VTVFVFYPTB KVWLEVEPVG MLXEtSRVBI RCLADGNPPP HFSISKQNP9 TKEAEEETIN 300 

DNGVLVLEPA RKEHSGRYEC QAHNIiDTMIS LLSEPQELLV NYVSDVRV6P AAPERQEGSS 360 

LTLTCEABSS QDLEFQWLRB ETDQVIiERGP VLQLBDLKRE AGGGYRCVAS VP8IPGLNRT 420 

QLVKLA2FGP PHHAFRERKV HVKEimVLNI* SCBASGHPRP TISWHVN6TA SEQDQDPQRV 480 

ZiSTLNVLVTP ELLBTGVECT ASNDLGKNTS ZLFLELVNLT TLTPDSNTTT GLSTSTASPH 540 

TRAKST8TER XLPEFBSR6V VIVAVIVCIL VLAVLGAVIiY FLYKKGKLPC RRSGXQEZTL 600 
PPSRfCTELW EVKSDKLPEB MGLLQGSSGO KRAP6DQ6EK YZDIiRB 



Seq ID NO: 121 DNA sequence 
Nucleic Acid Accession #: NM_018306
Coding sequence: 60-671



1 11 21 31 41 51 

I I I I I I

ATA6TCTACA CAGAGCTCOC CTTGCTGCCC AGACAA6CTG AAGGAOACA GGAAAAGCCA 60 

rGGMSHCrrC AOCATOCTCC TCCOSCCTC AGGACAACAO TCAAGTOCAC AGAGAAACAG 120 

AAGATGTAGA CTATGGAGAG ACAGA1TTCC ACAAGCAAGA 0G6GAAGGCT GGACTCTTTT 180 

CCCAAGAACa ATATGA6AGA AACAAGTCTT CTTCCTCCTC CTTCTCTTOC TOCTCATCCT 240 

CCTCATCTTC TTCATCCTCC TCCTCCTCAG GTCCTGGGCA TGGGGAGCCT GAOSTTTTGA 300 
AGGAIGAGCT TCAACTCTAT GGAGATGCTC CTGGAGAGGT GGTACCCTCT GGGC5AATCW5 360

GACTCCGAAG GAGAGGCTCT GACCCAGCAA GTGGAGAAGT GGAGGCCTCT CAGTTAAGAA 420 

GACTGAATAT AAAfiAAAGAT GATGAGTTTT TCCMTTOGT CXTTCCTCTGC TTP6CCATCG 480 

GGGCCTTGCT OGTOTOTTAT CACTATTACQ CAQACTGGTT CATGTCTCTT GGGGTOGGCC 540 

TOCTCACCTT O6CCT0CCTG GAAAOOGTT G GCATCTACTT OSGACTAGTG TAOOGTATCC 600 

ACAGOCTCCT CCAAGGCTTC ATCCCCCTCT TCCAGAAGTT TAGGCTGACA GGGTTCAGGA 660 

AGACTGACTG AGGCCACTTC CAGGTGGGCA GCAGAGGCAG GCCCCAGTGT GACCACCACT 720 

GOSACCCCTG AGCCCACAAG G6CAGAGCAG CAT7CTGAGA GACGCACAGG AGACCAAGCC 780 

AGACCAATAA ACAGAACACT TTTOCTTCCA TGT6GTCTGA ATGTTGGCAC CA6COC3G06C 840 

AGGGOCATCT CATTTGGGCA GTACTGCTCT GCAACCGA6C TGCAAGGATG GAAGGCAGAG 900 

GGTOSGTGTO GGGCCTGAGG CTTCACAGTA CCTGGACCAG CAGGAAGATT CTGGGAGGTC 960 

ACTGCTCTCA GAGGACAGCA AGGGACCCTG AGCTCTGCAA GCTGTGATCT GTCTGGGTTC 1020 

ATGCTTTTTC TCAAATCCCA GGCTATXTTGC ATGCGCTCTC AGGTGCTACC GAGCCATCCT 1080 

GGGAGAGATG GATGGTCCAC TGCTTTGA6G CAGGGAGOCA T00G6CT66G GOCOCTTGGT 1140 

GAACXTTGATG CAGGTAAGAT GCTGAGGACT AAAACCATTT TTTTTOCAOC CAAAAAAAAA 1200 

GGCAGGAAAA TGATCATCAG AAACTAAATG GCAGCCAGGC ATGGGGGCTC AGGACTGTAA 1260 

TCCTCGCACT TTGGGAGGCT CAGGCTAAGG GTOGCTTGAA GCTGAGAGTT CAAGACCAAC 1320 

CTGGGCAACA TAGTGAGACC CCCATCTCTA CAATTTTTTT TTAATGACCA AATGTGGOGG 1380 

TACATACCTG TACATACCTG OGGTTCCAGC TACTCAAGAG GCTQAGGCAG GAGGACTGCT 1440 

TGAGCCCAGG AGTTCAOGGC TGCA6TGAGG TA08ATCAA0 CCACTGCACT CCAGCCTGGG 1500 
CGACAGAGCA AGATOGTTTC TCTAAAATT 



Seq ID NO: 122 Protein sequence:
Protein Accession #: NP_060776

1 11 21 31 41 51 

I I I I I I

KETSASSSQP QDNSQVHRET EDVDYGETDF RKQDGKAGIiF SQBQYGRNKS SSS5FSSSSS 60 

8SSSSSSSSS GPGH6EP0VL RDSLQLYGDA P6EWPSGBS GLRRRG5DPA SGEVEASQLR 120 

RU7ZKXDDEF FE7VLLCFAZ GALLVCYHYY ADHFNSIiGVO IiLTFASIiETV GIYFGLVYRI 180 
HSVIiQGFZPL FQXFSLTGFR KTD 

Seq ID NO: 123 DNA sequence
Nucleic Acid Accession #: BC022542
Coding sequence: 243..896

1 11 21 31 41 51 

I I I I I I

ACTTGGTCCC A60CGATAAA TCTGQGGCAG 060GCGGTAG GAGCTGOGGG OGGCCABGCC 60 

CCTTGCT60G TOOGCACCTG GG0C0606C6 COOCTCTCGO GOGTCOGGCT TOOGGOGTCC 120 

TGGOGGCTCG GGTGG0GG06 GTrOGG G OQS C060CTGGCT GCrCCTOSQG GOSGGGACGG 180 

GGCTCACGCG C30GGCCC3GCC AOOGCCTTCA C0GCCGCGCX3 CTCTGAOGCC GGCATAAGGG 240 

CCATGTGTTC TGAAATTATT TTGAGGCAAG AAGTTTTGAA AGATGGTTTC CACAGAGACC 300 

TTTTAATCAA AGTGAAGTTT GGGGAAAGCA TTGAGGACTT GCACACSTGC OGTCTCTTAA 360 

TTAAACAGGA CATTCCTGCA GGACTTTATG TGGATGCGTA TGAGTTGQCT TCATTA06AG 420 

AGAOAAACAT AACA6AG6CA GT6ATGGTTT CAQAAAATTT TQATATA6A6 GCCCCTAACT 480 

ATTTGTCCAA GGAGTCTGAA GTTCTCATTT ATGCCAGACG AGATTCACAG TGCATTGACT 540 

GTTTTCAAGC CTTTTTGCCT GTGCACTGCC GCTATCATCG GCC3GCACAGT GAAGATGGAG 600 

AAGCCTCGAT T6TGGTCAAT AACCCAGATT TGTTGATGTT TTGTGACCAA GAGTTCCCGA 660 

TTTTOAAATG CTGGGCTCAC TCAGAAGTG6 CAGCCOCTTG TGCTTTGGAT AATQAQQATA 720 

TAT6CCAAT6 GAACAAGATG AAGTATAAAT CAGTATATAA GAA1t?rQATT CTACAA6TTC 780 

CAGTGGGACT GACTGTACAT ACCTCTCTAO TATGTTCTGT QACTCTOCTC ATTACAATCC 840 

TGTGCTCTAC ATTGATCCTT GTAGCAGTTT TCAAATATGG CCATTTTTCC CTATAAGTTT 900 

TAIOTAGTTA AATGCTTCCT A6AAACCTAA ATAA6ATCTA TTAATTTCTG AC GAGAGG TQ 960 

TTCTTCTAGA ATTAATTACT TTTATCTTTT GTCTTCATTT GTGGOCAAAA TTATtJTTTAC 1020 

TAGAGGAAAT TTGGGATCAT TCTCAGCTAA TTCCAAAATG TAGTGCTCTA TTGCATG(SlT 1080 

CXTTOGTAAT CCTCAAGC3VT CAGATGCCAT AAGGGGAAAC TTAATTCT6C TAAATTAATG 1140 

' riTArmirr gagaagtqac tttatcttca tttoqggtag aaaaattatt tctttatgta 1200 

GTAGAGACAA ATTATTCTGA TTTTGCAAGT ACTTTCAATT TAA6CTACAA ATTGAGAAAA 1260 

GOGITATAAA TAAGAATAAA ATAG6CCAGG CACAGTGGCT CACAOCTGTA ATCCCASCAC 1320 

TTTGGGAGGC 06A0GTGGGC GGATCACCAG AOGTCAAGAG TTTGAGACCA OCTTGGTGAA 1380 

ACCCTGTCTC TACTAAAAAT ACAAAAGTTA GCTGGQOCTG GTQGTGGGCA TCTGTAGTCC 1440 

CAGCTAATTG GAAGGGTSAG GOGGGAGGAT CGCTTGAACC TGGGAGGCGG AGGTTCCAGA 1500 

GAGCCAAGAT CXKIACCACTG CACTACAGCC TGGGCGACAG AACGAGACCC TGTCTCCAAA 1560 

GGAAAAACAA AAAAGAAGAA TAAAATAATT TGGATGAAAA TCATGTTTAT TTAAATAGTA 1620 

ATOTCATGAG ACTATTAAAG ATGTGCCAGA GTTTCAATGA AAATCATTAA AGTAGGACAG 1680 

CTAAGAAATT AATATTAATA TAAAAATTAT TGATAATCTT AAATTATTGA TTATTCCTTA 1740 

A06CACTCCA TTCTCCTTTT ACATTTTATC ATGTTTCTTT TQAATATATG AATTGGCAAA 1800 

GGACTTGATG AAACTGA6TA CIAAGATTT8 GTACAGAGTA TQTCAGGAAG ACAACTCAGA 1860
TT6CCATTTT AAATAAAGTT GTACATGAAC AAAAAAAAAA AAAAAA 



Seq ID NO: 124 Protein sequence:
Protein Accession #: AAH22542

1 11 21 31 41 51 

I I 1 1 - I I 

MCSBIXLRQB VLKDGFBKDIi LIKVKFGBSI EDLBTCRLLI KQDZPAGLyV DPYELASLRE 60 
HNITEAVMVS E3TPDIEAFNY IiSKESEVLIY AHRDSQCIDC FQAFLFVECR YBRP&SEDGE 120 
ASIWNNTOL U4FCDQAGSR RMIRFRFDSF DXTIEFPIUC CNAHSEVAAP CAUENEDIOQ 180 
NNKKRYKSVY KNVILQVPVO LTVBTSLVCS VTLLITILCS KKKKK 

Seq ID NO: 125 DNA sequence
Nucleic Acid Accession #: NM_004994.1
Coding sequence: 20..2143

1 11 21 31 41 51

I I i I 1 I 

AGACACCTCT GCCCTCACCA TGAGCCTCTG GCAGCCCCTG GTCCTGGTGC TCCTOGTGCT 60 

GGGCTGCTGC 'mXXTCC C C CCAiSACAGOG CCAGTCCACC CTTGTGCTCT TOCCTQtSAGA 120 

OCIGAGAAOC AATCTCACOG ACAGGCWSCT OGCACAGGAA TAOCTGTACC GCTATGGTTA 180 

CACTOGGGTG GCAGAGATGC GTGGAGAGTC GAAATCTCTG GGGCCTGOGC TGCTGCTTCT 240 

CCAGAAGCAA CTGTCCCTGC CCGAGACOGG TGAGCTGGAT AGOSCCAOGC TGAAGGCCAT 300 

GOCSAACCCCA OGGTGCGGGG TCCCAGACCT GQGCAGATTC CAAACCTTTO AGGGOQACCT 360 

CAAGTGGCAC CACCACAACA TCACCTATPG GATCCAAAAC TACTOGGAAG ACTTGCOGOG 420 

GGOGGTGATT GAOGACGCCT TTGCCOGOGC CTTCXSCACTG TGGAGOGCGG TGAOGCOGCT 480 

CAOCTTGACT OGOGTGTACA G0C3GGGA0SC AGACATOSTC ATCCAGTTTG GTGTOGCGGA 540 

GCAOGQASAC OGCri'ATCCCT TC3GA0GGGAA GGA0GG6CTC CTGGCACACG CCTTTCCTCC 600 

TGGCCCCGGC ATTCAG66AG ACGCCCATTT OGAOGATGAC GAGTTGTGGT CCCTGGGCAA 660 

GGGCGTCGTG GTTCCAACTC GGTTTGGAAA CGCAGATGGC GCGGCCTGOC ACTT OCCC TT 720 

CATCTTCGAG GGCOGCTCCT ACTCTGCCTG CACXACCGAC GGTOSCTCOG ACGGCTTGCC 780 

CTGGTGCAGT ACCAOOGCCA ACTAOGACAC OOAOGACCGG TTTGGCTTCT GCCCCAGCGA 840 

GAGACTCTAC ACCCGGGACG 6CAAT6CT6A TGGGAAACCC T6CCAGTTTC CATTCATCTT 900 

CCAAGGCCAA TCCTACTCOG CCTGCACCAC GGACX3GTOGC TCOGACGGCT ACC36CTGGTG 960 

CGCCACCACC GCCAACTACG ACCGGGACAA GCTCTTCGGC TTCTGCCCGA CCOGAGCTGA 1020 

CTOGACGGTG ATGGGGGGC3V ACTCGGCGG6 GGAGCTGTGC GTCTTCCCCT TCACTTTCCT 1080 

GOGTAAGGAQ TACTCGACXT GTACCAS08A CGGC060GGA GATG6GG6CC TCTGGTGCGC 1140 

TACCACCTCG AACTTTGACA 6CX3ACAA6AA GTGGGGCTTC TGCCGG6ACC AAGGATACAG 1200 

TTTGTTCCTC GTOGOSGCGC ATGAGTTCGG CCA06CGCTG GGCTTAGATC ATTCCTCAGT 1260 

GC0GGA6GOG CTCATGTACC CTATGTACCG CTTCACTGAG GGGCCCCCCT TGCATAAGGA 1320 

OGACGTGAAT GGCATCCGGC ACCTCTATX3G TCCTCXJCCCT GAACCTGAGC CACGGCCTCC 1380 

AACCACCAOC ACACOQCAQC CCAOQGCTCC CCOSAOGOTC TGOXSCACOG GACCCCTCRC 1440 

TGTCCAOCCC TCAGAGCGOC CCACAGCPGG CCCCACAGGT CCCCCCTTCAO CTGGCCCCAC 1500

AGGTCCCCCC ACTGCTGGCC CTTCTAOGGC CACTACTGTG CCTTTGAGTC CGGTGGAOSA 1560 

TGCCTGCAAC GT6AACATCT TCGACGCCAT CGCGGAGATT GGGAACCAGC TGTATTTGTT 1620 

CAAGGATG6G AA6TACTGGC GATTC T CTGA GGGCAGGGGG A6CC6GCCGC AGGGCCCCTT 1680 

OCTTATOGOC GACAA6TGGC COGGGCTGCC CGGCAAGCTG 6ACTCX3GTCT TTGAGGAGCC 1740 

GCTCTCGAAO AAGCTTTTCT TCTTCTCTGG 60GCCAGGTG TGGGIGTACA CAGGC6CGTC 1800 

GGTGCTGGGC CXXSAGGCGTC TG6ACAAGCT GGGCCTGGGA GCXXSACGTGG CXXAGGTGAC 1860 

OGGGGCCCrC CGGAGTGGCA GGGGGAAGAT GCTGCTGTTC AGCGGGCGGC GCCTCTGGAG 1920 

GTTOGAOSTG AAGG06CAGA TGGTGGATCC CCG6AGC6CC AGCGAGGT6G ACCX3GATGTT 1980 

OOCCGGGGTG CCTTTGGACA OGCAOSAOGT CTTCCA6TAC 08A6AGAAA8 CCTATTTCTG 2040 

GCAG6ACG6C TTCTACTGGC G0GTGA5TTC COGGAGTGAa TTGAACCAGG TQtSACCAAGT 2100 

GGGCTAOGTG ACCTATGACA TCCTGCAGTG CCCTGRGGAC TAGGGCTCCC GTCCTGCTTT 2160 

GCAGTGCCAT GTAAATCCCC ACTQGGA<Xa ACCCTGGGGA AGGAGCCAGT TTGCCGGATA 2220 

CAAACTGGTA TTCTGTTCTG GAGGAAAG66 AGGAGTGGAG GTGGGCTGGG CCCTCTCTTC 2280 
TCACCTTTGT TTTTTGTTGG AGTGTTTCTA ATAAACTTGS ATTCICOaiAC CTTT 



Seq ID NO: 126 Protein sequence:
Protein Accession #: NP_004985.1

1 11 21 31 41 51

I I I I 1.1. 

MSLMQPLVLV LLVUGCCFAA PRQRQSTIiVL PPGDLRTNLT DRQLAEEYliY RYGYTRVAEM 60 

RGESKSLGPA LLLLQKQLSL PETGELDSAT LKAMRTPROG VPDLGRFQTF EGDLKWHHHM 120 

ITYWIQNYSB DLPBAVIDDA FARAPALWSA VTPLTFTRVY SRDADIVIQF 6VAEHGDGYP. 180 

FDGKDGLLAH AFPPGPGIQG DARFI»DELH SLGKGVWPT RFGNADGAAC HFFFXPEGRS 240 

YSACTTDGRfi DGLPWCSTTA HYDTDORFGF CPSEEILYTRD GNADGKPCQP PPIFQQQSYS 300 

ACTTDGRSDG YRWCATTANY DRDKLFGPCP TRADSTVMQO NSAGSLCVFP PTFLGKEYST 360 

CTSEGRfflXSR LWCATTSNFD SDKKWGPCPD QGYSLPLVAA HEFGHALGLD HSSVPEALMY 420 

PMYRFTBGPP LHKDDVNGIR HLYGPRPEPB PRPPTTTTPQ PTAPPTVCPT GPPTVHPSER 480 

PTAGPTGPPS AGPTGPPTAG PSTATTVPZiS PVDDAQIVNI FDAIAEXGHQ LYLFKDGKyN 540 

RPSEGRGSRP QGPFIiIADKW PALPRKLDSV FEEPLSKKLF PFSGRQVHVY TGASVUZPRR 600 

IDKZiGLGADV AQVTGALRSG RGKMLLFSGR RIiHRPOVKAQ MVDPRSASEV DRMFPGVPtf 660 
TBDVFQYREK AyPOQDRFYH RVSSRSEWQ VDQVGYVTYD XIiQCPED 



Seq ID NO: 127 DNA sequence
Nucleic Acid Accession #: NM_004181
Coding sequence: 32-670

1 11 21 31 41 51 

11)111 

GCAGAAATAG CCTAGGGAGA TCAACCCOGA GATGCTGAAC AAAGTGCTGT CCCGGCTGGG 60 

CGTOGCOG G C CAGIG006CT TOSTGGACGT GCTGG6GCTG GAAGAGGAGT CTCTGG6CTC 120 

OGTGCCAGOO CCT GO CTGOO CGCTGCTGCT GCTGTTTCCC CTCAOQGCCC AGCATGAGAA 180 

CTTCAGGAAA AAGCAGATTG AAGAGCTGAA GGGACAAGAA GTTA6TCCTA AAGTGTACTT 240 

CATGAAQCAG ACCATTGGGA ATTCXrTGTGG CACAATCGGA CTTATTCACG CAGTGGCCAA 300 

lAATCAAGAC AAACTGGGAT TT6AG6ATQG ATCAGTTCTG AAACAGTTTC TTTCTGAAAC 360 

AGAGAAAATG TCCCCTGAAG ACA6A6CAAA ATGCTTT6AA AAGAATGAGG CCATACAGGC 420 

AGCCCATGAT GCC6TG6CAC AGGAAGGCCA ATGTCGGGTA GATGACAAGG TGAATTTCCA 480 

TTTTATTCTG TTTAACAACG TGGATG6CCA CCTCTATGAA CTTGATGGAC GAATGCCTTT 540 

TOOGGTGAAC CATGGOGCCA GTTCSUaAGQA CACCCTOCTQ AAGGAOSCTG CCAAGGTGTG 600 

CAGAGAATTC ACCGAOCGTQ AGCAAGGAQA AGTC06CTTC TCTGCCGTGG CTCTCTGCAA 660 

OGCAGGCTAA TGCTCTGTQG GAGGGACTTT 0CT6ATTTCC CCTCTTCCCT TCAACATGAA 720 

AATATATACC CCXX31T6CAG TCTAAAATGC TTCftGTACTT GTGAAACACA GCTGTTCTTC 780 

TGTTCTGCAG ACAOSCCTTC CCCTCA6CX31 CACXX3U5GCA CTTAAGCACA AGCAGAGTGC 840 

ACAGCTGTCC ACTGGGCCAT TGTGGTGTGA GCTTCAGATG GTGAAOCATT CTCCCCAGTG 900 

TATGTCTTGT ATGOGATATC TAAOGCTTTA AATGGCTACT TTGGTTTCTG TCTGTAAGTT 960 
AAGAOCTTGG AT6TGGTTAT 6TTGT0CTAA AGAATAAATT TTGCTGATAG TAGC 



Seq ID NO: 128 Protein sequence:
Protein Accession #: NP_004172

1 11 21 31 41 51

| | | | | |

MZ^IKVLSRLG VAGQWRFVDV U2LEEESL6S VFAPACAIiLL I.FPLTAQHEN FRKKQIEELK 60 

GQEVSPKVYF MKQTIGiraCG TZ6LZHAVAM MQDXL6FEDG SVLSQF1<SBT EKMSPBDSAK 120 

CFEKNEAXOl^ AHDAVAQB6Q CRVDDKVNFH FZLFBNVDGR LYEUXSRMPP FVHBGASSED 180 
TLIiKDAAKVC RBFTEREQGB VRFSAVUCK AA 



Seq ID NO: 129 DNA sequence
Nucleic Acid Accession #: NM_000213
Coding sequence: 127-5385

1 11 21 31 41 51

I I I I 1 I 

GGCC06CX303 CTGCAGOCCC A7CT0CTAGC G6CA60CCA0 60606GAGGG AGC6AGTGGG 60 

CCCCGAGGTA GGTCCA6GAC GGG06CACAG OVGCAGCOGA GGCTGGCOGG GAGAGGGAGG 120 

AAGAGGATGG CAGGGCCACG CCCCAGCCCA TGGGCCAGGC TGCTOrrGGC AGCCTTGATC 180 

A60GTCAGCC TCTCTGGGAC CTT6GCAAAC G6CTSCAAGA AG6CC0CAGT GAAGAGCTGC 240 

AOGGAGTGTG TG0GT6TGGA TAAGGACT6C GCCTACTOOi CAGA03A6AT GTTCAGGGAC 300 

0G6Q6CT6CA ACACCCAGGC GGASCTGCTG GCOQOSGGCT GCCM3C3GGGA GACCATCGT6 360 

GTCATGGAGA GCA6CTTGCA AATCACAGAG GAGACCCAGA TTGACACXIAC CCTGOGGCGC 420 

AGCCAGATGT CCCCCCAAGG CCTGCX3GGTC OGTCTGOGGC COGGTGAGGA GOGGCATTTT 480 

GAGCTGGAGG TGTTTGAGCC ACT6GAGAGC CC06TGGACC T6TACATCCT CATGGACTTC 540 

TCCAACTCCA TGTC0GAT6A TCT6GACAAC CTCAAGAAQA TGGGGCAGAA GCTGGCT0G6 600 

GTCCTGAGCC AGCTCACCAG OGACTACACT ATTGGATTTO GCAAGTTTGT 6GACAAAGTC 660 

AGCGTCCCGC AGACGGACAT GAGGCCTGAG AAGCTGAAGG AGCCCTGGCC CAACAGTGAC 720 

CaXCCTTCT CCTTCAAGAA OGTCATCAGC CTGACAGAAG ATGTGGATGA GTTCOGGAAT 780 

AAACTGCAGO GAGAGCGGAT CTCAGGCAAC CTGGATGCTC CTGAGGGOGO CTTCGATGCC 840 

ATCCTGCAGA CAGCTGTGTG CAOGAGGGAC ATTGGCTGOC GCC0G6ACAG CACCCACCTG 900 

CTGGTCTTCT CCACCGAGTC AGCCTTCCAC TATGAGGCTO ATG6CG0CAA CX3TGCTGGCT 960 

GGCATCATGA GC0GCAAC3GA TGAAOGGTGC CACCTGGACA CCACGGGCAC CTACACOCAG 1020 

TACAGGACAC AGGACTACCC GTCGGTGCCC ACCCTGGTGC GCCTGCTOGC CAAGCACAAC 1080 

ATCATCCCCA TC n T G C l ' G T CACCAACTAC TCCTATAGCT ACTAOGAGAA GCTTCACAOC 1140 

TATTTCCCTG TCTCCTCACT GGGGGTGCTG CAGGAGGACT CGTCCAACAT CGTGGAGCTG 1200 

CTGGAGGAGG CCTTCAATOG GATCOGCTCC AACCTGGACA TCCGGOCCCT AGACAGOCCC 1260 

OGAGGCXrrrC OGACAGAGGT CACCTOCAAG ATGTTCCAGA AGACGAGGAC TGGGTCCTTT 1320 

CACATC06GC QGG6GQAAGT G6QTATATAC CAGGTGCAGC TGCGGGCCCT TGAGCACGTG 1380 

GATGGGAOGC ACGTGTGCCA GCTGCOGGAG GACCAGAAGG GCAACATCCA TCTGAAACCT 1440 

TCCTTCTCCG ACGGCCTCAA GATGGACX5CG GGCATCATCT GTGATGTGTG CACCTGOGAO 1500 

CTGCAAAAAG AGGTGCGGTC AGCTCGCTGC AGCTTCAACG GAGACTTOGT GTGCGGACRG 1560 

TO I G TGTGCA GGGASGGCTG GA6TGGGCA0 ACCTQCAACT GCTCCACCGG CTCTCTGAGT 1620 

GACATTCAGC CCIGOCT6G0 Q6AGGG0QA0 GACAAGCOGT GCTOOGGCOG TGGGGAGTGC 1680 

CAGTGCGOGC ACTGT G TGTG CTACGGCXSAA G6C0GCTACG AGGGTCACSTT CTGCGAGTAT 1740 

GACAACTTCC AGTGTCCCOS CACTTCCXSGG TTCCTCTGCA ATGACX3GAQG AOGCTGCTCC 1800 

ATG66CXAGT GTGTGTX3T6A GCCTGGTTGG ACAGGCCCAA GCTGIXSACTG TCCOCTCAGC 1860

AATOCCAOCT GCATGGACA6 CAATGGGGGC ATCTGTAATO OAOQIGGCCA CTGTGAGTGT 1920 

G6C06CT6CC ACTGOCAOCA GCAOTOSCIC TACA09QACA CCATCIOOGA GATCAACTAC 1980 

TCGGOGATCC ACC0GG6CCT CTGaSAGGAC CTAOGCTCCT GOGTGCAGTG CCAGGCGTGG 2040 

GGCACXX36CX3 AGAAGAAGGG GCGCACX3TGT GAGGAATGCA ACTTCAAGGT CAAGATQGTG 2100 

GAOGAGCTTA AGAGAOCCGA GGAG6TGGTG GTGOGCTGCT CCTTGOGGGA OGAGQATGAC 2160 

6ACTGCA0CT ACA6CTACAC CAT6QAAGGT GAOSGGGCOC CTGGQCCCAA CAGCACTSTC 2220 

CIGGTGCACA AGAAGAAGGA CTGCOCTCOO G6CTCCTTCT G GTG G CTCAT CCCCCTGCTC 2280 

CTCCTCCrCC TGCX3GCTGCT GGCOCTGCTA CTGCTGCTAT GCTGCAAGTA CTGTGCCTGC 2340 

TGCAAGOCCT GCCTGGGACT TCTCCOGTGC TGCAACOGAG GTCACATGGT GGGCTTTAAG 2400 

GAAQACCACT ACATGCTGOG GGAGAACCT6 ATGGOCTCTG ACX3VCTTGGA CA06CCX3irG 2460 

CTGOGGAGCX} OGAACCTCAA GG6C0STGAC GTGGTCX36CT GGAA6GTCAC CAACAACATO 2520 

CA6066CC1X3 GCTTT6CCAC TCATGOC36CC AGCATCAACC CCACA6A6CT GGTGGCCTAC 2580 

GGGCTGTOCT TGCGOCTGGC COGCCTTTGC ACOGAGAACC TGCTGAAGCC TGACACTOGG 2640 

GAGTGCGCCC AGCTOOGCCA GGAGOTGGAG GAGAAOCTGA ACGAGGTCTA CAGGCAGATC 2700 

TCOSGTGTAC ACAAGCTCCA GCAGA0CAA6 TTCOGGCAGC AGCCCAATGC OGGGAAAAAG 2760 

CAAGACXSVCA CCATTGTG6A CACAGT6CT6 ATGGOGOCCC GCTGG6CCAA 6C0G60CCTQ 2820 

CTGAA6CTTA CAGA6AAGCA GGTG6AACAG AGGQCCTTCC ACBAOCTCAA G GT GG CCCCC 2880 

GGCTACTACA CCCTCACTGC AGACGAGGAC GCCOGQGOCA TGGTGGAGTT CCAGGAOGGC 2940 

GTGGAGCTGG TGGAOGTACG GQTOOCCCTC TTTATCCGGC CTGAGGATGA CGACGAGAAG 3000 

CAGCTGCTG G TGGAGGCCAT OSAOGTGCCC 6CAGGCACT6 CCACCCT06G COGCCGCCTQ 3060 

GIAAACATCA CCATCATCAA GGAGCAAGCC AGAGAOGTGG TGTOCTTTQA GCAGCCTGAO 3120 

TTCTOSgrCA GCC60GG6GA CCAGGTGGCC CGCATCCCTG TCATCCGGCG TGTCCTGGAC 3180 

GGCG66AAGT CCCAGGTCTC CTACC6CACA CAG6ATGGCA OOGOGCAGGG CAAC0GG6AC 3240 

TACATCCCC6 TGGAGGGTGA GCTGCTGTTC CAGCCTGCGG AGGCCTGGAA AGAGCTGCAG 3300 

GTGAA6CTCC TOGAGCTGCA AQAAOTTGAC TCCCTCCTGC GOGGCOGCCA GGTCCGCCGT 3360 

TTOCAOgTCC AOCTCAGCAA CCCTAAGTTT CS06CCCACC TG6GCCA6CC CCACTCCACC 3420 

ACCATCATCA TCAG66ACCC AGAT6AACTG 6AC0GGAGCT TCAGQAGTCA GATOTTSTCA 3480 

TCACAGCCAC CCCCTCACGG OGAOCTGGGC GCCCOQCAQA AGCCCAATGC TAAGGCCGCT 3540 

GGGTCCAGGA AGATCCATTT CAACTOGCTQ CO X Vi'l^JTO GCAAGCCAAT GGGGTACAGG 3600 

GTAAASTACT GGATTCAGGG TGACTCCGAA TCCGAAGCCC ACCTGCTOGA CAGCAAGGTG 3660 

CCCTCAOlOO A8CTGACCAA CCTGPCACCCG TATTGOGACT ATGAGATGAA 6GTOT6C6CC 3720 

TA0QQG6CTC AGGQOSAiGGG ACGCTACAGC TGGCT6GTGT GCIQOGGCAC CCA0CAG6AA 3780

GTGCCCAGCG A6CCAGGG0G TCTGGCCTTC AATGTCGTCT CCTCCACGGT GACGCAGCTG 3840 

AGCTGGGCTG AGCOGGCTGA GAGCAACGGT QAGATCACAG OGTA0GA6GT CTGCTATGGC 3900 

CTGGTGAAGG AT6ACAAC0G ACCTATTGGO OCCA TOAftGA AAGT6CTGGT TGACAACCCT 3960 

AAGAAOCGGA TOCTGCTTAT TGAGAACCTT OQGQAGTOOC AGCOCTACCQ CTACAOGGTG 4020 

AAG0C8CGCA AC6GG6C06G CTGGGG6CCT GAiSOGGGAGG GCATCATCAA CCTQOCCACC 4080 

CAGOCCAAGA GGCOCATGTC CATCOCCATC ATCCCTGACA TCCCTATCGT G6AG6CCCAG 4140 

AGCGGGGAGG ACTAOGACAG CTTCCTTATG TACAGOGATG AOGTTCTACG CTCTCCATCG 4200 

GGCAGCCAGA GGCCCAGCGT CTCCGATGAC ACTQAGCAOC TGGTGAAZY3G CCGGATQGAC 4260 

TTTGCCTTCC OGGGCAGCAC CAACTCCCTO CACAGGATGA CCAG6A0CAG TGCTGCTGCC 4320 

TATGGCACCC ACCTGAGCCC ACACGTX5CCC CACCGOGTGC TAAGCACATC CTGCACCCTC 4380 
ACACGGGACT ACAACTCACT GACCCGCTCA GAACACTCAC ACTOGACCAC ACTGCCGAGG 4440

GACTACTCCA CCCTCACCTC CGTCTCCTCC CACGACTCTC GCCTGACTGC TGGTGTGCCC 4500 

GACAOGCCCA CCOGCCTGGT GTTCTCTGCC CTGGGGCXX31 CATCTCTCAG AGTGAGCTCG 4560 

CAGGAGC06C GGTGCGAG06 GCOGCTGCAG GGCTACAGTG TGGAGTACCA GCIXKriGAAC 4620 

GGCGGTGAGC TGCATCGGCT CAACATCCCC AACCCTGCCC AGACCTOGGT GGTGGTGGAA 4680 

GACCTCCroC CCAACCACTC CTACGTGTTC OGCGTGOGGG CCCAGAGCCA GC5AAGGCTGG 4740 

GGC06AGA6C GTGAGGGTGT CATCACCATT GAATCCCAGG TGCACCOGCA GAGCCCACTG 4800 

TGTCCCCTGC CAGGCTCOGC CTTCACTTTG AGCACTCCC31 GTGCCCCAGG CCOQCTGGTG 4860

TTCACTGCCC TGAGCCCAGA CTOGCTGCAG CTGAGCTGGQ AGCX3GCCACG GAGGCCCAAT 4920 

GGGGATATOQ TOGGCTACCT GGTGACCTGT GAGAT66CCC AAGGAGGAG6 GCCAGCCACC 4980 

GCATTCOGGG TGGAT6GAGA CAGCCCCXSAG AGCCGGCTGA COGTGCGGGG CCTCAGCGA6 5040 

AA06TG00CT ACAAGTTCAA GGTGCAGGCX: AGGACCACTG AGGGCTT06G 6CCAQAG0GC 5100 

GAGG6CATCA TCACCATAGA GTGOCAGGAT GGAGGACOCT TCCCGCAGCT GGGCAGCOGT 5160 

GCOGGGCTCT TCCAGCACCC GCTGCAAAGC GAGTAC3UX3V GCATCACCAC CACCCACACC 5220 

AG06CCACOQ AGCCCTTCCT AGT6GATGGG CCGACCCTGG GGGCCCAGCA CCTGGAGGCA 5280 

OGCGGCTGCC TCACGCGGCK 1GTGAC0CAG GAGTTT6T6A GGC3GGACACT GAGCACCAGC 5340 

GGAACCCTTA GCACOCACAT G6A0CAACAG TTCTTCCAAA CTTSVCOGCA COCTGCCCCA 5400 

CCCCOGCCAT GTCCCACTAG GCGTCCTCCC GACTCCTCTC C06GAGCCTC CTCAGCTACT 5460 

CCATCCTTGC ACCCCTGGGG GCCCAGCCCA CCCGCATGCA CAGAGCAGQG GCTAOGTGTC 5520 

TCCXG6GA0G CATGAAGGGG GCAAGGTCCX3 TCCTCT6TG6 GCCCAAACCT ATTT6TAACC 5580

AAAGA6CIGG GA6CA6CACA AGGACOCAGC Cm V lTCitf CACTTAATAA ATS G TTTT G C 5640 
ACT6 



Seq ID NO: 130 Protein sequence:
Protein Accession #: NP_000204

1 11 21 31 41 51 

I 1 i 1 I I 

KAGPRPSPWA RLLLAALISV SLSGTLANRC fOCAFVXSCTE CVRVDKEXIAy CTDEMFKDRR 60 

CNTOAELLAA GGQRESIWN ESSFQITEET QZOTTUIRSQ KSPQ6LKVRL RP(SBRHFEL 120 

EVFEPLESPV DIiYILMDFSN SMSDDIiDMZiK KMSQNLARVL SOLTSDYTZG FGXPVDKVSV 180

PQTDMRPEKL KEPWPNSDPP FSFKNVISLT EDVDEFRNKL QGESISGKLD AFSGGFDAIL 240 

QTAVCTRDIG HRPDSTHLLV FSTESAFHYE ADGANVLAGI MSRNDERCHL DTTGTTrQYR 300 

TQDYPSVFTL VRLXAKHNII PIFAVTNYSY SYYEKLSTYF FVSSXiGVLQE DSSHIVELLB 360 

EAFHRIRSHL DZRALDSPRG URTEVTSKMF QKTBT6SFHX RRQEVOZYOV QIiRALEHVDG 420 

TKVCX2LPEDQ K6HIKLKPSP SD6LKMDAGI lOTVCTCBLQ KBVRSARCSP N6DFV0GQCV 480 

CSBQWSGQTC HCSTGSLSDI QPCLREGEDK PCSGRGBCQG CmCVCYGBGR YBGQFCEYQH 540 

FQCPRTSGFL OORGRCSKG QCVCEPGHT6 PSCDCPLSNA TCIDSNGGIC NGRGHCEOSR 600 

C3ICHQQSLYT DTICEZNYSA IHPGLCEDLR SCVQOQAWGT GSKKGRTCEB CNFKVKMVOE 660 

LKBAEEVWR CSFRDEDTOC TYSYTNBQDG AFOFNSTVLV HKKXDCPPGS PmniZPLLLL 720 

LLPLLALLLL ZiCffKYCACCK ACLALI.PCCM RGBMVGFKED RVNIiRENLMA SDHUyTPMLR 780 

SGNLKGia)W RHKV7NNMQR PGFATBAASI NPTELVPYGL SUUiARLCTE NLLKPDTREC 840 

AQLRQEVEEN UJEVYRQISG VHKLQQTKFR QQPNAGKKQD HTIVDTVLMA PRSAKPALLK 900 

LTEEQVEQRA FHDLKVAPGY YTLTADQOAR GMVEFQBGVE LVDVRVPLFI RPEDDDEKQL 960 

LVEAIDVPAG TATLGRRLVN ITIIKBQARD WSFEQPEFS VSRGDQVARZ PVZRRVUN3G 1020 

KSQVSYRTQD GTAQGI1R0Y2 PVEGELLFQP GEAWKELQVK LLELQEVDSL LRGRQVRRFH 1080 

VQLSNPKFGA HLQQPBSTTI XIRDPDELDR SFTSQKLSSQ PPPHGDLGAP QNPHAKAAG5 1140 

RKIHFNWLPP SGXPMGYRVK YHIQGDSBSE AHLLOSKVPS VELTVLYPYC DYEMKVCAYG 1200 

AQGEGPYSSL VSCRTHQSVP SEPGRLAFMV VSSTVTQLSH AEFAETiaGEI TAYEVCYGLV 1260 

NDONRPIGPH XXVLVDHFXK RMLLZENLRB SQpyRYTVKA R1IGA6H6PBR BAIINLATQP 1320 

KRPHSZPZIP DZPIVDAQSQ EDYDSFLMYS DDVZiRSPSGS QRPSVSDDTB HLVNQRMDFA 1380 

FFGSTNSLHR MTTTSAAAYG 7BLSPHVFHR VLSTSSTLTR DYNSLTRSBH SHSTTLPRDY 1440 

STLTSVSSKD SRLTAGVPDT PTRLVPSALG PTSLRVSWQE PRCERPLQGY SVEYQLUJGG 1500 

EEJIRLNIPNP AQTSWVEDL LPNHSYVFRV RAQSQE6HGR ERBGVITIES QVHPQSPLCP 1560 

LPGSAFTLST PSAPGPLVFT ALSPDSLQLS WERPRRPKO? IVGYLVTCEM AQGOGPATAF 1620 

RVDGDSPBSR LTVPGLSENV FYKFKVQART TEGFGPERB6 IITIE5QDGG PFPQL6SSAG 1680 

LFQBPLQSSY SSXTTIBTSA TEPFLVD6PT LGAQHXiBASG SIiTRHVTQEF VSRTLTTSGT 1740 
X«STBMDQQFF QT 

Seq ID NO: 131 DNA sequence
Nucleic Acid Accession #: BC004372
Coding sequence: 132..2231

1 11 21 31 41 51

| | | | | |

CCTOGTGCCG CXXSACCCCAG CCTCTGCCAG GTTCGGTCOQ OCATCCTCOT CCOGTCCTCC 60 

GGOGGCCCCT GCCXXX3CGCC CAGGGATCCT CXrAGCTCCTT TOSCCCGOQC CCTC05TTG6 120 

CT006GACAC CAT66ACAA6 TTTTGGT66C A06CAGCCT0 GGGACTCTGC CTOGTGCOGC 180 

TGAGCCTG6C GCAGATCGAT TTQAATA7AA CCTGCCGCTT TGCA6GTGTA TTOCAOGTGG 240 

AGAAAAATGG T06CXACAGC ATCTCTOGGA OGGAGGCCGC TGACCTCTGC AAGGCTTTCA 300 

ATA6CACCTT GCCCACAAT6 GCCGAGAT66 AGAAAGCTCT GA6CATGGGA TTTGA6ACCT 360 

GCAG6TATG6 GTTCATAGAA QOOCATOTG G T6ATTCCCC6 GATCCACCCC AACTCCATCT 420 

GT6CAGCAAA CAACACAG6G GTGTACATOC TCACATCCAA CACX7CCCAG TAT6ACACAT 480 

ATTOCTTCAA TGCTTCAGCT CCACCIGAAG AAGATTGTAC ATCaWTTCACA GACCTGCCCA 540 

ATGCCTTTQA TQQACCAATT ACCATAACTA TTGTTAACCG TGATGGCACC CGCTATGTCX: 600 

AOAAAGGAGA ATACAGAACG AATCCTGAAS ACATCTACCC CA6CAACCCT ACTGA7X3ATG 660 

A0GTGA6CAG OGOCTCCTCC A6TGAAA6GA GCAGCACTTC AGQAGGTTAC ATCTTTTACA 720 

CCTTTTCTAC TGTACACCCC ATCCCAGAOO AAOACAGTCC CTGGATCACC GACA6CACA6 780 

ACAGAATCCC T6CTACCA6T AOGTCTTCAA ATACCATCTC AGCAGGCTGG GAGCCAAATG 840 

AAOAAAATGA AGATGAAAGA GACAGACACC TCAGTTTTTC TGGATCAG6C ATTGATGATG 900 

ATGAAGATTT TATCTCCA6C ACCATTTCAA CCACACCA06 66CTTTTGAC CACACAAAAC 960 

AGAACCAGGA CTGGACOCAO TQGAACOCAA GCCATTCAAA TGOOOAAGTO CTACntlAOA 1020 

CAACCACAAG GATGACT6AT 6TAGACAGAA ATGOCACCAC TQCTTATGAA GGAAACTOQA 1080 

ACCCAGAAOC ACACCCTCCC CTCATTCACC ATGAGO^CA TGAGGAAGAA GAC3VC0CCAC 1140 

ATTCTACAAO CAGAATCCAG GCAACTCCTA 6TAGTACAAC GGAAQAAACA GCTAOCCAGA 1200 

AGGAACA6T0 GTTTGGCAAC AGATGGCAT6 A6GGATATCG CCAAACACCC AGAGAAGACT 1260 
CCCATTOGAC AACAGGGACA GCTGCAGCCT CAGCTCATAC CAGOCATCCA ATGCAAGGAA 1320

GGACAACACX: AAGCCCAGAG GACAGTTCCT GGACTGATTT CTTCAACCCA ATCTCACACC 1380

CCATQGGAOO A6GTCATCAA GCAGGAAGAA GGATGGATAT GGACTCCAGT CS^TAGTACAA 1440

cscrrcAscc tactgciuuvx ccjuu^cacag GTTTGGTGGA AGATTTG6AC AGGACAG6AC 1500

CTCTTTCAAT GACAAC3SCAQ CAGAGTAATT CTCAGAGCTT CICTACATCA CATGAAGGCT 1560

TGGAAGAAGA TAAAGACCAT CX^ACAACXT CTACTCTGAC ATCAAGCAAT AGGAAT6ATG 1620

TCACAGGTGG AAGAAGAGAC CCAAATCATT CTGAAGGCTC AACTACTTTA CTGGAAGGTT 1680

ATACCTCTCA TTACCCACAC AOGAAOGAAA GCAGQACCTT CATCCCAGTG ACCTCAGCTA 1740

AGACTGGGTC CTTTGGAGTT ACTGCAGTTA CTGTTGGAGA TTCCAACTCT AATGTCAATC 1800

GTTCCTTATC AGGAGACCAA GACACATTCC ACCCCAGTGG GGGGTOOCAT ACCACTCATG 1860

GATCTGAATC AGATG6ACAC TCACA7GGGA GTCAAGAAGG TGQAGCAAAC ACAACCTCTG 1920

GTGCTATAAG GACACCCCAA ATTCCAGAAT GGCTGATCAT CTTGGCATCC CTCTTGGCCT 1980

TQGCTTTGAT TCTTGCAGTT TGCATTGCAG TCAACAGTOG AAGAAGGTGT GGGCAGAA6A 2040

AAAAGCTAGT GATCAACAGT GQCAAT6GAG CTGTGGAGGA CAGAAAGCCA AGTGGACTCA 2100

AGGGAGAGGC CAGCAAGTCT CAGGAAATGG T6CATTTGGT GAACAAGGAG TCXTTCAGAAA 2160

CTCCAGAGCA GTTrATGACA GCTGATf^GA CAAGGAACCT GGAGAATGTQ GACATGAAGA 2220

TTGGGCTGTA ACACCTACAC CATTATCTTG GAAAGAAACA ACOGrTGGAA ACATAACCAT 2280

TACAGQQAGC TOGGACACIT AACAGATQCA ATGTGCTACT GATTGTTTCA TTGOGAATCT 2340
TTTTTAGCAT AAAATTTTCT ACTCTTAAAA AAAAAAAAAA AAAAAAA



Seq ID NO: 132 Protein sequence:
Protein Accession #: AAH04372



1 11 21 31 41 51 

I I I I I t 

MDKFVniHAAW GLCLVPLSLA QIDLSIITCRF AGVFHVEKUQ RYSXSRTEAA DLCKAFNSTL 60 

PTMAQMBKAL SIGFETCRYG PIEGHWIPR IHPNSICAAK NTGVYILTSN TSQYDTYCFN 120 

ASAFPEEDCT 8VTDLFSIAFD GPITITXVNR XX3TRYVQKGB YRTNPBDIYP SNPTDDDVSS 180 

65SSERSSTS GGYXFYTFST VHPIFDEDSP WZTDSTDRZP ATSTSStlTZS AGWEPNEBre 240 

DERDKHIjSFS GSGXDDDEDF XSSTXSTTPR AFDBTKQNQD WTQWNFSHSN PEVLLCrrri'R 300 

MTDVDENGTT AYBGNHNPEA HPPLIHHEHH EESEETPHSTS TIQATPSSTT EBTATQKBQK 360 

FGNRMHBGYR QTPREDSHST TGTAAASAHT SHPMQGRTTP SPEDSSWTDF FNPISHPMOR 420 

GBQAGRRKDM DSSBSTTLQP TANPHTGLVB DLDRT6PLSM TTQQSMSQSF STSEBGLEED 480 

KDHPTTSTLT 8SHRKDVTGG RROPNHSBG3 TTLXiBGYTSB YPHTKKSRTP IPVTSAKTGS 540 

FGVTAVTVta? SNSHVNRSLS tSDQfftPBPSG GSBTTH6SES DGHSHGSQBG GANTT5GPIR 600 

TPQXPE»7LXX LASLLAXALX LAVCXAVHSR BROGQKKKLV XNSGUGAVEO RKPSGLKGEA 660 
SKSQEMVBLV NKBSSETPDQ FMTADETRNL QNVDMKZ6V 



Seq ID NO: 133 DNA sequence
Nucleic Acid Accession #: NM_002882
Coding sequence: 150-755

1 11 21 31 41 51

I 1 11 I I 

0GAGGTTC6Q GT0GTGG6QC GGAGGGAA6A 6CGGGCGGGC GGGAGGCGCC GGOGCCAGAC 60 

GOSGAGGGAA GGAGCTACGA GTAGCCGCOG AGAGGCCGCG GAGCCAGCGA OGACOQACCC 120 

AGCCGAGCOS COSCCGCCGC 0G06CCC0CA TGGOGGCCGC CAAGGACACT CATGAGGACC 180 

AT6ATACTTC CACT6AGAAT ACAGACQAGT CCAACCATGA 00CTCA6TTT GAG0CAATA6 240 

TTTCTCTTCC TGAGCAAGAA ATTAAAACAC TGGAAGAAGA TGAAGAGGAA CTTTTTAAAA 300 

TGCGGGCAAA ACTGTTCCGA TTTGCCTCTG AGAACGATCT CCCAGAATGG AAGGAGOGAO 360 

GCACXGGTGA 0C3TCAAGCTC CTGAAGCACA AGGAGAAAGG GGCCATCCGC CTCCTCATGC 420 

GGAGGGACAA GACCCTGAAG ATCTGTGCCA ACCACTACAT CACGC06ATG ATGGAOCTGA 480 

AGGCCAA06C AGGTAGGGAC OGTGOCTG GG TCIGGAACAC CCA06CTGAC TT0GG0GA06 540 

AGTGOCCCAA GCCAGAGCTG CT6G0CATCC GCTTCC7GAA TQCT6AGAAT GCACAGAAAT 600 

TCAAAACAAA GTTTGAAGAA TGCAGGAAAG AGATCGAA6A GAGAGAAAA6 AAAGCAGGAT 660 

CAGGCAAAAA TGATCATGCC GAAAAAGTGG CGGAAAAGCT AGAAGCTCTC TOGGTGAAGO 720 

AGGAGACCAA GGAGGATGCT GAG6AGAAGC AATAAATCGT CTTATTTTAT TTTCTTTTCC 780 

TCTCTTTCCT TTCCTTTTTT TAAAAAATIT TACOCT GC CX: CTCTTTTTCG GTTTGTTTTT 840 
ATTCTTTCAT TTTTACAAGG GAC6TTATAT AAAGAACIQA ACTC 



Seq ID NO: 134 Protein sequence:
Protein Accession #: NP_002873

1 11 21 31 41 51

I I ! I I I 

MAAAKDTHED HDTSTBim)E SNEDPQFEPX VSLFEQEZKT LHEDEEELFX MRAKLFRFAS 60 
ENDLPEWKER GTGDVKLLKH KEKSAIRXiLM RRDKTXiKICA KHYXTPMNBL KFNA6SDRAH 120 
VWNTHADFAD ECPKPBLLAI RFLNABIAQK FKTKFSEC3UC BIEERBKXA6 SGXNDHAEKV 180 
AEXLEALSVK EETREDAEEK Q 



Seq ID NO: 135 DNA sequence
Nucleic Acid Accession #: NM_000077.2
Coding sequence: 277-742



1 11 21 31 41 51 

I I I ! I I 

CCCAACCTOG GGOGACTTCA GQTGTGCCAC ATT06CTAAG TGCTCGGAGT TAATAGCACC 60 

TCCTCOSAGC ACTCGCTCAC GGCGTCCCCT TGOCTtSGAAA GATACOGOGG TCCCTCCAGA 120 

QGArrTGAGG GACAGGGTOG GAGGGGGCTC TTCC60CAGC ACCGGAGGAA GAAAGAGGAG 180 

6G6CTG6CT6 6TCACCAGAG GGTGGGG06G ACOQOGT800 CTCG6066CT GOQGAGAGGG 240 

GGAGAGCAGG CA60G66GG6 0GGG6AGCAG CATGGAGCOG GC6G08GQGA GCA6CATG6A 300 

GCCTTCGGCT GACTGGCPQG OCAOGGCOOC GGCCCGGGGT OGGGTAGAOG AGGTGCGGGC 360 

GCTGCTGGAG GCGGGGGOOC TGCCCAACGC ACOGAATAGT TAOGGTOGGA GGCCGATCCA 420 

GGTCATGATQ ATGGGCAOOO CCOGAGTGGC GGA6CTGCTG CTGCTCCAOG GOGOGGAGOC 480 
CAACTGCGCC GACCCCGCCA CTCTCACCXXS ACCCGTGCAC GAC3GCrGCCX: GGGAGGGCTT 540

CXTGGACAOG CTGGTGGTGC TGCACOGGGC OGGGGOGOGG CTGGAOGTGC GOGATGCCTO 600

GGGCOGTCTG CCOGTGGACC TGGCT6AGGA GCTGGGCCAT 0GOGATGT08 CAOGGTACCT 660

GCGCG0G6CT GGGGGGGGOV CCAGA6GCA0 TAACCAT60C 06CATA6ATG COGOQGMGG 720

TCCCTCAGAC ATCCCOGATT GAAAiCSAACCA GAGAGGCTCT GAGAAAOCTC GGGAAACTTA 780

GATCATCAGT CACCGAAGGT CXTTACAGGGC CACAACTGCC OCOGCCACAA CCCACCCOGC 840

rrrOGTAGTT TTCATTTAGA AAATAGAGCT TTTAAAAATG TCCTGCCTTT TAAOGTAGAT 900

ATATGCCTTC CCCCACTACC GTAAATGTCC ATTTATATCA TTTTTTATAT ATTCTTATAA 960

AAATGTAAAA AAGAAAAACA CCGCTTCTGC CTTTTCACTG TGTTGGAGTT TTCTGGAGTG 1020

ACCACTCAOG CCCTAAGOGC ACATTCATCT GGGCATTTCT T6C3GAGCCTC GCAGCCTCOS 1080

GAAGCTGT08 ACTTCATGAC AA6CATTTTQ TGAACTAGG6 AAGCTCAGG6 GGGTTACTGG 1140

CTTCTCTTGA GTCACACTGC TASCAAATGG CA6AACCAAA 6CTCAAATAA AAAXAAAATA 1200
ATTTTCATTC ATTCACTC

Seq ID NO: 136 Protein sequence:
Protein Accession #: NP_000066.1

1 11 21 31 41 51

| | | | | |

MEPAA6SSMB PSADWLATAA ARGRVEBVRA LLEAGALFNA PNSYGRRPIQ VMMKGSARVA 60
ELLLUIGAEP NCADPATLTH PVHDAARBQP LDTLWIARA GARLDVRDAM GRIiPVDIjAEB 120
USaSDVPSYh RAAA66TRGS NHARIOAAEG PSDIPD

Seq ID NO: 137 DNA sequence
Nucleic Acid Accession #: NM_058196.1
Coding sequence: 104-421

1 11 21 31 41 51

| | | | | |

TGTGTGGGGG TCTGCTTGGC GGTGA66GQG CTCXACACAA GCTTCCTTTC 0GTCAT6C06 60

GCCCCCACCC TGGCTCTGAC CATTCTGTTC TCTCTGGCAG 6TCATGATGA TGGGCAGCGC 120

C0GAGTGG06 QAGCTGCTGC TGCTCCACGG 060GGAGOCC AACTG06C0G ACCCCGCCAC 180

TCTCACCOGA CCCGTGCA06 ACGCTGCCOG GGAGGGCTTC CrGGACAOSC TGGT6GT6CT 240

GCACOGGGCC GGGGCX30GGC TGGACGTGGG CGATGOCIOG OGCOGTCTGC CCGTGGACCT 300

GGCTGAGGA6 CTGGGOCATC GCGATOTOGC ACGGTAOCTG 060QCGGCT6 OSGGGGGCAC 360

CAGAGGCAGT AAOCATGCCX: GCATAGATGC C60QGAAGGT CXXTCACACA TCCCCGATTG 420

AAAGAACCAG AGAGGCTCTG AGAAACCTCG GGAAACTTAG ATCATCAGTC ACCGAAGGTC 480

CTACAQ(3GCC ACAACTGCCC C06CCACAAC CCAOCOOGCT TTOGTAGTTT TCATTTAGAA 540

AATAGAGCTT TTAAAAAT8T CCTOOCTTTT AACGTAGATA TAAGCCTTOC CCGACTAC08 600

TAAATGTCCA TTTATATCAT TTTTTATATA TTCTTATAAA AATGTAAAAA A6AAAAACAC 660

CGCTTCTGCC TTTTCACTQT GTTGGAOTTT TCTGQAGTGA GCACTCAOGC CCTAAGCGCA 720

CATTCATGTG GGCATTTCTT GCGAGCCTCG CAGCCTCCGG AAGCTGTCGA CTTCATGACA 780

A6CATTTTGT GAACTAGGGA AGCTCAGGGG GGTTACTGGC TTCTCTTGAG TCACACTGCT 840
AGCAAATGGC AGAACCAAAG CTCAAATAAA AATAAAATAA TTTTCATTCA TTCACTC




Seq ID NO: 138 Protein sequence:
Protein Accession #: NP_478103.1

1 11 21 31 41 51

| | | | | |

MMMGSARVAE LLLLEGAEFH CADPATLTRP VHDAAREGPL DTLWLHRAO ARLDVRfiANQ 60
RIiFVDIiAEEL GHRDVARYLR AAAOGTRGSN HARIDAAE6P SDIPD

Seq ID NO: 139 DNA sequence
Nucleic Acid Accession #: NM_058197.1
Coding sequence: 272-684

1 11 21 31 41 51

| | | | | |

CCCAACCTGG GGOGACTTCA GGTGT6CCAC ATTOGCTAAG TGCT0G6AGT TAATAGCACC 60

TCCTCOGAGC ACT06CTCAC GGGGTGCCCT T0CCT66AAA GATACCQOGG TCCCTCCA6A 120

GGATTTGAGG GACAGGGTOG GAGGGGGCTC TTCCSCCRGC ACCGGAGGAA GAAAGAGGA6 180

6GGCTGGCTG GTCACCAOAG GGTGGG006G ACCGCGTGCG CTCGGCGGCT GCGGAGAGGG 240

GGAGAGCA86 CAGC06G0GG C6GGGAGCAG CATGGAGCC6 GCGGCGGGGA 6CAGCATGGA 300

G0CG6CG6CG 6GGA6CA0CA TGGAGGCTTC GGCTGACTGG CT6GCCA0SG COGCGGCCOQ 360

GGGT06GGTA GAGGAGGTGC            GGAGGOGGGG GCGCT60CCA ACGCACCQAA 420

TAGTTAOGGT CGGAG6CCGA TCCAGGTGGG TAQAAGGTCT GCAGCGGGAG CAG6GGATGG 480

OGGGOGACTC TGGAGGACG/V AGTTTGCAGG GGAATTGGAA TCAGGTAGOS CTTCGATTCT 540

CCGGAAAAAG QGGAGGCTTC CTGGGGAGTT TTCAGAAGGG GTTTGTAATC ACAGACCTOC 600

TGCTGGOSAC G00CT6GGQG CTTGGGAAAC CRAGGAftGAG GAATGAGGAG CCA06CG0ST 660

ACAGATCTCT 06AAXGCTGA GAAGATC7GA AGGG6GGAAC ATAnTGTAT TAGATG(3UU3 720

TCATGATGAT GG6CAGCGCC CX3AGTGQCGQ AGCTGCTGCT OCTCCAOQGC G06GAGCCCA 780

ACTGCGCC6A CCCCGCCACT CTCACCCGAC C0GTGCACX3A CGCTGCCGGG GAGGGCTTCC 840

TGQACACQCT GGTQGTQCT6 CACCGG6CCG GGGQGOGGCT GGAQGTGOGC GATGOCTGGG 900

GCCGTCT6CC OGTGQAOCIG 6CTGAGGAGC 1Y3GGCCAT0G C8ATGTGGCA CS6TA0CTGC 960

GCQCQOCTGC GGGGQGCACC ASA06CA6TA AGCAT60C0Q CATA6AT0GC GOQGAA0QTC 1020

CCTCAGACAT CCCOGATTGA AAGAACCAGA GAGGCTCTGA GAAACCTCGG GAACTTAGAT 1080

CATCAGTCAC CGAAGGTCCT ACAGGGCCAC AACTGCCCCC GCCACAACCC ACCCC6CTTT 1140

OGTAGTTTTC ATT7AGAAAA TAGAGCTTTT AAAAATGTCC T6CCTTTTAA CGTAGATATA 1200

TGCCTTCOCC CACTACOGIA AATGTCCATT            TTTATATATT CTTATAAAAA 1260

T6ZAAAAAAG AAAAACACCG C1TCT6CCTT TTCACTOTGT T6QAGTTTTC T66A6TGAGC 1320

ACTCAC6CCC TAAGOGCACA TTCATGTGGG OlTTTCTTGC GAGCCTC6CA GCCTCCGGAA 1380

6CTGTCGACT TCATGACAAG CATTTTGTGA ACTAGG6AAG CTCAGGQGGO TTACTGGCTT 1440

CTCTTGRGTC ACACTGCTAG CAAATGGCAO AACCAAAGCT CAAATAAAAA TAAAATAATT 1500
TTCATTCATT CACTC

Seq ID NO: 140 Protein sequence:
Protein Accession #: NP_478104.1

1 11 21 31 41 51

| | | | | |

MEPAAGSSMB PAAGSSKEPS AEMLATAAAR 6RVEEVRALL BAGALPNAPN SY6RRPZQVG 60
RRSAAGACaXS GRLHRTKFAG ELSSGSASZL RKXORLPGBP SBGVOfflRPP P<a2ALGA»ST 120

Seq ID NO: 141 DNA sequence
Nucleic Acid Accession #: NM_058195.1
Coding sequence: 163-684

1 11 21 31 41 51

| | | | | |

CCTCCCTACG GGOSCCTCOG GCAGCCCTTC COGOGTGOGC AGGGCTCAGA GCOGTTCCGA 60

GATCTTGGAG GTCOGGGTGG GAGTGGGGGT G0GGTGGG6G TGGGGGTGAA GGTGGGGGGC 120

GGGG606CTC AGGGAAGGQG GGTGOGOGCX: TGCX3GGG0GG AGATGGGCAG GGGGCGGTGC 180

OTGOOTCCCA GTCTGCAGTT AAGGGGGCAG GAGTGGCGCT GCTCACCTCT GGTGCCAAAG 240

GGOGGOGCAQ 0GQCTGC06A GCTCGGCCCT GGAGGOGGCG AGAACATGGT G06CAGGTTC 300

TTGGTGACCC TCCX3GATT06 GC60GCX3TGC GGCCCX5C0GC GAGTGAGGGT TTTCGTGGTT 360

CACATCCCGC GGCTCAOGGG GGAGTGGGCA GCGCCAOGQO OGCCCGCOGC TGTGGCCCTC 420

GTGCTGATGC TACTGAGGAG CCAGOmrCA GGGCAGCAGC OGCTTCCTAG AAGACCAGGT 480

CATGATQATG GGCAGOGCCC GAGTGSGQGA GCTGCTGCIO CTGCAOGGOG OGGAOOCCAA 540

CTG06C06AC CC06CCACTC TCACOOGACC C6TGCA0QAC 6CTGCCCX3GO AGG6CTTCCT 600

GGACACGCTO GTGGTGCTGC ACOSGGCaSG GGCX5C3GGCTG GAOGTGCGOG ATGCCTGGGG 660

COGTCTGCCC GTGGACCTGG CTGAGGAGCT GQGCCATOGC GATGTCX3CAC GGTACCTGOS 720

CG0GGCTGO6 GGGGGCACCA GAGGCAGTAA CCATG0CX3QC ATAGATGCOG OSGAA6GTOC 780

CTCAGACATC C0C6ATTGAA AGAACCAGAG AGGCTCT0A6 AAACCTGGOO AAACTTAQAT 840

CATCAGTCAC OGAAGGTCCT ACA66GCCAC AACTGOOCCC GCCACAACOC ACCCCGCTTT 900

CX5TAGTTTTC ATTTAGAAAA TA6AGCTTTT AAAAATGTCC TGCCTTTrAA OGTAGATATA 960

TGCCTTCCCC CACTACCGTA AATGTCCATT TATATCATTT TTTATATATT CTTATAAAAA 1020

T6TAAAAAAG AAAAAC3VCGG CTTCTGCCTT TTCACTGTGT TGGAGTTTTC TG6AGTGAGC 1080

ACTCAC6CCC TAA606CACA TTCAT6TGGG CATTTCTTGG QAGCCTGGCA OCCTOOGGAA 1140

6CTGTGGACT TCATGACAAG CATTTTGTGA ACTAG66AA0 CTGA6GGGG6 TTACTG6CTT 1200

CTCTTGAGTC ACACT6CIA6 GAAATGGCA6 AACCAAAGCT CAAATAAAAA TAAAATAATT 1260
TTCATTCATT CACTC

Seq ID NO: 142 Protein sequence:
Protein Accession #: NP_478102.1

1 11 21 31 41 51

| | | | | |

M8R6RCV6PS XiQLRGQEWRC SPLVPKSGAA AAELQPGQQB NMVRRFLVTI. RZBRAC6PPR 60
VRVFWBZPR LTGEHAAPGA PAAVALVLML LRSQRIiGQQP LPRRPGHDOG QRPS6GAAAA 120
PRRGAQLRRP HKSBPTRARR CPGQLPGBAO GAAPGRGAA6 RARCLGPSAR OPG

Seq ID NO: 143 DNA sequence
Nucleic Acid Accession #: NM_018131
Coding sequence: 412..1107

1 11 21 31 41 51

| | | | | |

GAAATTGC3VC ACTTAAAGAC ATCA0T6GAT GAAATCACAA GTGGGAAAGG AAAGCTGACT 60

GATAAAGAGA GACAGAOACT TTTGGAGAAA ATTCGAGTCC TTGAGGCTGA GAAGGAGAAG 120

AATGCTTATC AACTCACAGA GAAGGACAAA GAAATACAGC 6ACTGAGA6A CCAACTGAAG 180

QdCAGATATA GTACTACOGC ATTGCTTGAA CAGCTGGAA6 AGACAA06AG AGAA6GASAA 240

AGGAGGGAGC AGGTGTTGAA AGCCTTATCT GAAGAGAAA6 A06TATTGAA ACAACAGTTG 300

TCTGCTGCAA CCTCACGAAT TGCTGAACTT GAAAGCAAAA CCAATACACT CCGTTTATCA 360

CAGACTGTGG CTCCAAACTG CTTCAACTCA TCAATAAATA ATATTCATGA AATGGAAATA 420

CAGCTGAAAG ATCCTCTOGA GAAAAATCAG CAGTGGCTOG TGTATGATCA GCAGOGGGAA 480

QTCTATGTAA AAGGACTTTT AOCAAAGATC TTTGAGTTGG AAAAGAAAAC GGAAACASCT 540

GCTCATTCAC TCCCACAGCA GACAAAAAAG CCTGAATCAG AAGOTTATCT TCAAOAAGAO 600

AAGCAGAAAT GTTACAAGGA TCTCTTGGCA AGTGCAAAAA AAGATCTTGA GGTTQAAOGA 660

CAAACCATAA CTCAGCTGA6 TTTTGAACTG AGTGAATTTC GAAGAAAATA TGAAGAAACC 720

CAAAAAGAAG TTCACAATTT AAATCAGCTG TTOTATTCAC AAAGAAGGGC AGATGTGCAA 780

CATCTGGAAG ATGATAGGCA TAAAACAGAG AAGATACAAA AACTCAGGGA AGAGAATGAT 840

ATTGCTAGGG GAAAACTTGA AGAAGAGAAG AAGAGATCCG AAGAGCTCTT ATCTCAGGTC 900

CAGTCTCTTT ACACATCTCT GCTAAAGCAO CAAGAAGAAC AAACAAGGQT AGCTCTGTTG 960

GAACAACAGA TGCAGGCATG TACTTTAGAC TTTGAAAAT6 AAAAACT06A COGTCAACAT 1020

GT6CAGCATC AATTGCATGT AATTCTTAAG GAGCTCCGAA AAGCAAGAAA AAATAACACA 1080

GTTGGAATCC TTGAAACAGC TTCATGAGTT TGCCATCACA 6AGCCATTAG TCACTTTCCA 1140

AGGAGAGACT GAAAACAGAG AAAAAGTTGC 06CCTCA0CA AAAAGTCOCA CTGCTGCACT 1200

CAAT66AA0C CTGGTGGAAT GTCGCAAGT6 CAATATACAG TATCCAGCCA CT6AGCATC0 1260

CGATCTOCTT GTCCATGTGG AATACTGTTC AAA6TAGCAA AATAAGTATT TGTTTTGATA 1320

TTAAAAGATT CAATACTGTA TTTTCTQTTA GCTTGTGGGC ATTTTGAATT ATATATTTCA 1380

CATTTTGCAT AAAACTGCCT ATCTAOCTTT QACACTCCAG CATGCTAGTG AATCATGTAT 1440

CTTTTAGGCT GCTGTOCATT TCTCTTGGCA GTGATACCTC CCTQUSVTGG TTCATCATCA 1500

GGCTGCAAT6 ACAGAATST6 GT6AGCAG06 TCTACTGAGA TACTAACATT TTGCACTGTC 1560

AAAATACTTG GT6A0GAAAA GATAGCTCAS GTTATTGCTA ATGGGTTAAT GCACCA6CAA 1620

GCAAAATATT TTATGTTTCG GGGOTTTTGA AAAATCAAAG ATAArTAACC AAGGATCTTA 1680

ACTGTGTTOG CATTTTTTAT CCAAQCACTT AtaU^AACCTA CAATCCTAAT TTTGATGTCC 1740

ATTGTTAAGA GOTGGTGATA GATACTATTT TTTTTTCATA TTGTATAGOG GTTATTAGAA 1800

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AAGTTGGGGA TTTTCTTGAT CTTTATTGCT GCTTACOVTT GAAACTTAAC CCACCTGTGT 1860 

TCCCCAACTC TGTTCTGGGC AC3GAAACAGT ATCTGTTTGA GGOITAATCT TAAGTQGCCA 1920 

C3VCACAATST TTTCTCTTAT GTTATCTG6C AGTAACTGTA ACTTGAATTA CATTAGCACA 1980 

TTCIGCTTAO CTAA A ATTO T TAAAA TAAAC TTXAATAAAC OCATGTAGCC CTCTCATTTG 2040 

ATTGACAGTA TTTTA6TTAT TTTTG0C31TT CTTAAAGCTG G6CAATGTAA T6ATCAGATC 2100 

TTTGTTTGTC TGAACAGGTA TTTTTATACA TGCTTTTTST AAAOCAAAAA CTTTTAAAIT 2160 

TCTTCAGGTT TTCTAACATG CTTACCACTQ GGCTACTGTA AATGAQAAAA GAATAAAATT 2220 
ATTTAATGTT TT 



Seq ID NO: 144 Protein sequence:
Protein Accession #: NP_060601

1 11 21 31 41 51

| | | | | |

MBXQLKDALS KNQQ«9LVY6Q QREVYVXGIOi AKIFELEKKT STAAHSLPQQ TIOCPESEGYL 60 
QEEKOKCm> LIiASAKKDLE VERQTITQLS FKLSEP&RKY EETQKEVHNL NQUiYSQRRA 120 
DVQHLEDDRH KTEKIQKLRE EHDIARGKLE EEKKRSEELL SQVQSLYTSL LKQQEBQTRV 180 
AIiLBQQKQAC TLDFENEKLD RQHVQHQLHV IIiKEIiRKARK NNTVGIl^A S 

Seq ID NO: 145 DNA sequence
Nucleic Acid Accession #: NM_001168
Coding sequence: 50..478

1 11 21 31 41 51

| | | | | |

CCGCCAGATT TGAATCXXSGG GACCCGTTGG CAGAGGTGGC GSC3G60GGCA TGGGTGCXXX 60 

6ACGTTGCCC CCTGCCTGGC AGCCCTTTCT CAAGGACCAC CXXIATCTCTA CATTCAAQAA 120 

CT6GCCCTTC TTGGAGGGCT G06CCTGCAC C006GAG066 ATG6C06AG6 CT6GCTTCAT 180 

CCACTGCCCC ACTQAGAAOS AGCCAGACTT GGCCCAGTST TTCTTCTGCT TCAAGGAGCT 240 

GGAAGGCTGG GAGCCA6ATQ AOGACCCCAT A6AGGAACAT AAAAA6CATT 06TCX3SGTTG 300 

CGCTTTCCTT TCTGTCAAGA AGCRGTTTGA AGAATTAACC CTTGGTGAAT TTTTGAAACT 360 

GGACA6A6AA A6AGCCAAGA ACAAAATTGC AAAGGAAACC AACAATAAGA AGAAAOAATT 420 

TGAGGAAACT G06AAGAAAG TGCGCCGIQC CATOGAGCAO CTG6CXGCC3^ TGGATTGAGG 480 

CCTCTGG006 GAGCTGCCT6 QTCCCAGAGT GGCTGCACCA CTTGCA6GGT TTATTCCCIG 540 

GTOCCACCAG CCTTCCTOTG GGCCCCTTAG CAATGTCTTA GGAAAGGAGA TCAACATTTT 600 

CAAATTAGAT QTTTCAACTG TGCTCCTGTT TTGTCTTGAA AGTGGCACCA GAGGTX^CTTC 660 

TOCCTGTGCA GOGGGTGCTC CTGGTAACAG TGGCTGCTTC TCTCTCTCTC TCTCTTTTTT 720 

GG6GGCTCAT VIVIMCIWI ' TTG A TTCOOG GOCTTACCAS GTGAGAAGIG AOQGAGGAAG 780 

AAG6CAGTGT CCCTTTT6CT AGA0CTGACA GCTTT6TT0S 08TGGGCAGA 60CTTCCACA 840 

GTGAATGTGT CTOGACCTCA TGTTGTTGAQ GCTGTCACAC TCCTGAGTGT GGACTTGGCA 900 

GGTGCCrOTT QAATCTGAGC' TGCAGOTTCC TTATCTGTCA CACCTGTGCC TCCTCAGAGG 960 

ACAGTTTTTT TGTT6TTGTG TTTTTTTGTT U T lVmm ' GGTAGATGCA TGACrTGTGT 1020 

GTGAT6AGAG AATG6A6ACA GAGTOCCTGG CTCCTCTACT GTmACAAC ATGGCTTTCT 1080 

TATTTT6TTT GAATTOTT A A TTCACAGAAT AOCACAAACT ACAATTAAAA CTAAOGACAA 1140 

AGCCATTCTA AGTCATTGGG GAAA0GGG6T GAACTTCAGO TGQATGAGGA GACA6AATAG 1200 

AGTGATAGGA AGCXTTCTCGC AGATACTCCT TTTGCCACTG CTGTGTOATT AGACAGGCCC 1260 

AGT6AGC0GC GGGGCACATQ CTGGCOGCTC CTCCCTCAGA AAAAG6CAGT GGCCTAAATC 1320 

CTTTTTAAAT GACTTGGCTC GATGCTGTGG GGGACI06CT GQ6CTGCTGC AG60CGTGTG 1380 

TCT6TCAGCC CAACCTTCAC ATCTGTC A OB TTCTCCACAC GGG06A6A6A 0GCA6TC06C 1440 

CCAGGTCXXX GCTTTCTTTG GAGGCAGCAG CTCCXX5CAGQ GCTGAAGTCT GGOGTAAOAT 1500 

GATGGATTTO ATTCX3CCCTC CTCCCTGTCA TAGAGCTGCA GGGTGGATTG TTACAGCTTC 1560 
GCTGGAAACC TCTGGAGGTC ATCT0GGCT6 TTCCTGAGAA ATAAAAAGCC TGTCATTTC 

Seq ID NO: 146 Protein sequence:
Protein Accession #: NP_001159

1 11 21 31 41 51

| | | | | |

MGAPTliPPAH QPFLroHRIS TPKimPFljEG CACTPERMAB AGFXHCPTEH EFDLAQCFFC 60 

FKELEGVEPD DDPIEEHKKH SSGCAFLSVK XQFEELTI<6B FUCUIRERAK NKXAKETMNK 120 
KKBFBBTAKK VBRAIBQLAA MD 

Seq ID NO: 147 DNA sequence

Nucleic Acid Accession #: NM_014176.1

Coding sequence: 127-720

1 11 21 31 41 51

| | | | | |

GCGCGCAGCG CTGGTACCCC GTTGGTCCGC GCX5TTGCTGC GTTSTGAGGQ 6TGTCAGCTC 60 

AGTGCATCCC AGGCAGCTCT TAGTGTGGAG CAGTQAACTO TCTOTGGTTC CTTCTACTTG 120 

G6GATCATGC AGAGAGCTTC ACGTCTGAAO AGAGA6CTGC ACATGT7AGC CACA6AGCCA 180 

CCCCCAGGCA TCACAT6TTG GCAAGATAAA GACCAAATGG ATGACCTGC6 A6CTCAAATA 240 

TTA6GTGGAG CCAACACACC TTAT6AGAAA GGTGTTTTTA AGCTAGAAGT TATCATTCCT 300 

GAGAGGTACC CATTTGAACC TCCTCAGATC CGATTTCTCA CTCCAATTTA TCATOCAAAC 360 

ATTQVTTCTG CTGGAA66AT TTGTCTGGAT GTTCTCAAAT TGCCACCAAA AGGTGCTTGG 420 

AGACCATOOC TCAACATCGC AACTGTOTTG ACCTCTATTC AOCTGCTCAT 6TCAGAACCC 480 

AACCCT6ATG ACC06CTCAT OGCTGACATA TCCTCA6AAT TTAAATATAA TAAGCCAGOC 540 

TTCCTCAAGA ATGCCA6ACA GTGGACAGA6 AAGCATGCAA GACAGAAACA AAAGGCTGAT 600 

GAGOAAGAGA TGCnOATAA TCTACCA6AQ GCIGGTGACT C CAGftGt ACA CAACTCAACA 660 

CAGAAAA06A AGGCCAGTCA GCTAGTAOGC ATA6AAAAGA AATTTCATCC T6ATGTTTA6 720 

GGOICTTGTC CTGGTTCATC TTAGTTAATG TGTTCTTTOC CAAGGTGATC TAAGTTGCCT 780

ACCTTGAATT. TTTTTTTAAA TATATTTGAT QACATAATTT TTGTGTAGTT TATTTATCTT 840 

GTACATATGT ATTTTGAAAT C7TTTAAA0C TGAAAAATAA ATAGTCATTT AATGTTGAAA 900 
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AAAAAAAAAA AAAAAAAAAft AAAAAAAA 



Seq ID NO: 148 Protein sequence:
Protein Accession #: NP_054895.1

1 11 21 31 41 51

| | | | | |

KQRASRLKRE LHMLATBPPP 6ITCHQDKDQ KD0LRAQIL6 GAHTPYEKGV FKLEVIIPER 60 

ypPEPPQIRF LTprVHPMlD SAGRICLDVL KLPPW3WRP SLNIATVLTS IQLU4SBPIJP 120 

DOPIMADISS BFKmKPAFb KHARQWTEKH ARQKQKADEE EMLDNLPBAG DSRVHKSTQK 180
RKASQIiVGXS KKFHPIXV 

Seq ID NO: 149 DNA sequence
Nucleic Acid Accession #: NM_003812
Coding sequence: 224-2722

1 11 21 31 41 51

| | | | | |

TCCTCTGCGT CCOSCCCOGG GAGTGGCTGC GAGGCTAGGC GAGCCGGGAA AGGGGGOGCC 60 

6CCCA6CCCC GAGCCCOGOS CCC06T6CCC CXSAQOCOQGA GCCOCCTGCC 060GGCG6CA 120 

OCATQG8CX3C aSASCCXSOOQ TCSAC06GCIC CSGCOOSOOGC O6CCC0GCAO CTAGCCOGOC 180

GCTCrOGCOQ GCOVCAOQGA GCG6080C08 GGA6CTATGA GCCA7GAAGC 06(X0GGCAG 240 

CAGCTCOOGG CAGCCGCCCC TGGOGGGCTG CAGCXTTGCC GGOGCTTCCT GOGGCCCCCA 300 

A060GGCCCC GCCGGCTCGG TGCCTGCCAG CX3CCOOG6CC CGCAOGCCGC CCTGCCGCCT 360 

GCTTCTOGTC CTTCTCCTGC TGOCTCOGCT OGCOGOCTOO TCCOGGCCCC GCX3CCT6GGG 420 

GGCTGCT G OQ 0CCA6CGCTC OGCATTQGAA TQAAACZQCA GAAAAAAATT TQQQAGTCCT 480 

GGCAOATGAA GACAATACAT TGCAACAOAA TAOQlGCAGr AATATCAOTT ACAGCAATGC 540 

AATGCAGAAA GAAATCACAC TGCCTTCAAG ACTCATATAT TACATCAACC AAQACTOGGA 600 

AAGCCCTTAT CACX3TTCTTG ACACAAAGGC AAGACACC3VG CAAAAACATA ATAAG6CTGT 660 

CCATCTG6CC CAG6CAA6CT TCCAGATTGA A6CCTTCGGC TCCAAATTCA TTCTTGACCT 720 

CATACTGAAC AATGGTTTGT TGTCTTCTGA TTATOTOGAO ATTCACTAOO AAAATGGQAA 780

ACCACAGTAC TCTAA6GGTG GAGAGCACTG TTACTAOCAT GGAAGCATCA GA6G0GTCAA 840 

AGACTCCAAG GTGGCTCTGT CAACCTGCAA TGGACTTCAT GGCATGTTTG AAGATGATAC 900 

CTTCGTGTAT ATGATAGAGC CACTAGAGCT GGTTCATGAT GAGAAAAGCA CAGGTCXSACC 960 

ACATATAATC CAGAAAACCT TG6CAG6ACA GTATTCTAAG CAAATGAAGA ATCTCACTAT 1020 

GGAAAGAOGT GACCA6TGGC CCTTTCTCTC TGAATTACAG TGGTTQMAA GAAGQAAQAO 1080 

A5CAGTGAAT CCATCACGTG GTATATTTQA AGAAATGAAA TATTTGGAAC TTATGATTGT 1140 

TAATGATCAC AAAACGTATA AGAAGCATOQ CTCTTCTCAT GCACATACCA ACAACTTTGC 1200 

AAAGTC06TQ OTCAACCTTO TGGATTCTAT TTACAAGGAG CAGOTCAACA CCAGGGTTGT 1260 

CCTGGTGGCT GTAGAGACCT GGACTGAGAA GGATCAGATT GACATCACCA CCAACCCTGT 1320 

6CAGATGCTC CATGAGTTCT CAAAATACOQ GCAGOOCATT AAGCAQCATO CTGATOCTQT 1380 

6CACCTCATC TCGOGGGTGA GATTTCACTA TAAGASAAGC AGTCTGAG7T ACTTTGGAGG 1440 

TGTCTGTTCT CXSCACAAOAO GAGTTGGTQT GAATGAGTAT GGTCTTCCAA TGGCAGTGGC 1500 

ACAAGTATTA TOGCAGAGCC TGQCTCAAAA CCTTGGAATC CAATGGGAAC CTTCTAGCAG 1560 

AAAGCX3UAA TGTGACTGCA CAGAATCCTG GGGTG6CTGC ATCATGGAGG AAACAGGGGT 1620 

GTCOCATTCr OGAAAATTTT CAAAGTOCAG CATTTTGGAO TATAGAOACT TTTTACAOAG 1680

AGGAGGTGGA GCCTGCCTTT TCAACAGGOC AACAAAGCTA TTTGAGCCCA 066AAT0TGG 1740

AAATGGATAC GTGGAAGCTG GGGAGGAQTG TGATTGTGOT TTTCATGTGG AATGCTATGG 1800 

ATTATGCTOT AAGAAATGTT CCCTCTCX3VA CGGGGCTCAC TGCAG06A0G GGCXTTGCTG 1860 

TAACAATACC TCAICTCTTT TTCAGCCACX3 AGGGTATGAA TGCOGGGATG CTGTGAAOGA 1920 

GTGTGATATT ACTGAATATT GTACT6GAGA CTCTQGTCAG T6C0CACCAA ATCITCATAA 1980

GCAAGAOGGA TATGCATGCA ATCAAAATCA GGGCCGCTGC TACAATGGOS AGTOCAAGAC 2040 

CA6AGACAAC CAGTGTCAGT ACATCTGGGG AACAAAGGCT GCAGGGTCTG ACAAGTTCTG 2100 

CTATGAAAAG CTGAATACAG AAGGCACTGA GAAGGGAAAC TGCGGGAAGG ATGGAGACCG 2160 

GTQGATTCAG TGCAGCAAAC ATGATGTGTT CTGTGGATTC TTACTCTGTA CCAATCTTAC 2220 

TOGAGCTCCA C33TATTGGTC AACTTCAGGG TGAGATC3Vrr CCAACTTCCT TCTACCATCA 2280 

AGGCOGGGTG ATTGACTGC31 GTGGTGCXXa TGTAGTTTTA GATGATGATA CGGATGTGGO 2340 

CTATGTAGAA GATGGAACGC CATX3TGGC0C GTCTATGATG TGTTTAGATC GGAAGTGCCT 2400 

ACAAATTCAA GCCCTAAATA TGAGCAGCTG TCCACTCGAT TCCAAGGGTA AAOTCTCTTC 2460 

GGGCCATGGG GTGTGTAOTA ATGAAGCCSU: CTGCATTTGT GATTTCACCT GGGCAGGGAC 2520 

AGATTGCAGT ATCOQGQATC CAGTTAGGAA CCTTCACCCC CCCAAGGATG AAGGACCCAA 2580 

GGGTCCTAGT GCCACCAATC TCATAATAGG CTCCATOSCT GGTGCCATCX: TGGTAGCAGC 2640 

TATTGTCCTT GGGGGCACAO GCTGGGGATT TAAAAATGTC AAGAAGAGAA GOTTOGATCC 2700 

TACTCAGCAA GGCCCCATCT GAATCAGCTG CGCTGGATQG ACACCGCCTT GCACTGTTQG 2760 

ATTCTQG6TA TGACATACTC GCAGCAGTGT TACTGGAACT ATTAAGTTTG TAAACAAAAC 2820 

CTTTG6GTGG TAATGACTAC GGAGCTAAAG TT6GGGTGAC AAGGATGGGG TAAAAGAAAA 2880 

CTGTCTCTTT TGGAAATAAT GTCAAAGAAC ACCTTTCACC ACXTTCTCAGT AAAOSGGGGA 2940 

GGGGGCAAAA GACCATGCTA TAAAAAGAAC TGTTCCABAA TCTTTTTTTT TCCCTAATQO 3000 
A0GAAG6AAC AACACACACA CAAAAATTAA ATGCAATAAA GGAATGATTA AAAA 



Seq ID NO: 150 Protein sequence:
Protein Accession #: NP_003803

1 11 21 31 41 51

| | | | | |

MKPPGSSSRQ PPLAGCSLAG AS06PQRGPA GSVPASAFAR TPPCRLLLVL LLLPPLAASS 60

RPRAHGAAAF SAPRHNETAB lOiLGVLADED imOWSSSN ISYSMAMQKE ITLPSRLIYY 120

mODSESPYB VU)TKARBQQ XHKKAVHLAQ ASFQIBAFGS KFIXi)LII<NK GU.SSDYVEI 180

RYENGKPQYS fOGGEHCYVBG SIRGVKDSKV ALSTCKGiaG MFEDDTFVyH ISPLELVHDE 240

KSTGRPHIIQ KTIiAGQYSKQ MKNLTMERGD QHPFLSEIiQW IiKKRKRAVNP SRGIFEEa4KY 300

LELMIVNDHK TYKKHRSSHA BTMNFAKSW NLVDSIYKEQ UITSWLVAV BTHTEKDQID 360

ITTHFVQMLH EFSRXRQSaK QHAZSAVHXiXS RVTFEnCRSS LSYFGGVCSR TRQVGVHBYG 420

LPMAVAQVLS QSLAQNLGIQ HEPSSRKPKC DCTESWQ6CX NEETGVSSSR KPSI^II^Y 480

RDFXjQRGGGA clfrrptklf bpteogsgyv EAGEBOX^GP HVBCYGLCCR KCSLSNGABC 540

SPG PCqW TS OiPQPRGyEC RDAVNBCDIT EYCZCTSGQC PPNLBKQDGY AOZONQGROr 600

NGBCKTRDNQ OQYIWGTKAA GSDKFCYEKL NTBGTEXGNC GKXXSDfiWIQC SKHDVFOGFL 660
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IiCnaiTHAPH iGQIiQGEIIP TSFYHQGRVI DCSGAHWU) DDTDVGWED GTPOGPSMHC 720 

IDKKCLQIQA LNMSSCPLDS KGKVCSGHGV CSNEATCICD FTWAGTDCSI RDPVRNXaPP 780 
KDEGPKGPSA TVLIIGSIAG AZZiVAAIVLO GTGHGFKNVK KRSFDFTQQ6 PZ 

Seq ID NO: 151 DNA sequence
Nucleic Acid Accession #: NM_023915
Coding sequence: 250-1326

1 11 21 31 41 51

| | | | | |

GGCAOSAGGG mVSTm C ATGCTTTAOC AGAAAATCCA CTTOCCTGOC GAOCTTAGTT 60 

TCAAAGCTTA T TCTT A ATTA GAGACAAGAA ACCTGTTTCA ACTTGAAGAC ACOQTATGAG 120 

QTGAATGGAC AGCCAGCCAC CACAATGAAA GAAATCAAAC CAGGAATAAC CTATGCTGAA 180 

CCCAOGCCTC AATOGTCCCC AAGTGTTTCC TGACAOGCAT CTTTGCTTAC AGTGCATCAC 240 

AACTGAAGAA TG6G6TTCAA CrTQAOGCTT GCAAAATTAC CAAATAACGA GCT6CA06GC 300 

CAAGAGAGTC ACAATTCAGQ CAACAGGA6C GAOQGGCCAG GAAAGAACAC CACOCTTCAC 360 

AAT6AATTTG ACACAATTGT C TfGaXa/ i' G CTTTATCTCA TTATATTTOT GGCAAGCATC 420 

TTGCTGAATG WX'A GCAGT GTGGATCTTC TTCCACATTA GGAATAAAAC CAOCTTCATA 480

TTCTATCTCA AAAACATAGT GGTTGCAGAC CTCATAATGA OGCTGACATT TCCATTTCXSA 540 

ATAGTCCATG ATGCAGGATT TGGACCTTGQ TACTTCAAGT TTATTCTCTG CA6ATACACT 600 

TCA6TTTTGT TTTATGCAAA CATGTATACT TCCATQGTGT TCCTT6G6CT GATAAGCATT 660 

6AT06CTATC TGAAGGT66T CAAGCCATTT GGGGACTCTC GGATGTAC3^6 CATAACCTTC 720 

AC3GAAGGTTT TATCTGTTTG TGTTTOGGTG ATCATGGCTG Tm-GTCTTT OCCAAACATC 780 

ATCCTGACAA ATGGTCAGCC AACAGAGGAC AATATCCATG ACTGCTCAAA ACTTAAAAGT 840 

CCTTTGGGGG TCAAATGGCA TACGGCABTC ACCTATGTCA ACAGCTGCTT GTTTGTGGCC 900 

GTGCTGGTGA TTCTGATOOG ATGTTACATA GCCATATCCA 6GTACATCCA CAAATGCAGC 960

AGGCAATTCA TAAGTCA6TC AAGC06AAAG OGAAAACATA ACX»GA6CAT CAOGGTTGTT 1020 

GTGGCTGTCT TTTTTACCTG CTTTCTACCA TATCACTTGT GCAGAATTCC TTTTACTTTT 1080 

AGTCACTTAG ACAGGCTTTT AGATGAATCT GCACAAAAAA TCCTATATTA CTGCAA AGAA 1140 

ATTACACTTT TCTTGTCTGC GTGTAATGTT TG0CTG6ATC CAATAATTTA CTTTTTCATG 1200 

TGTAGGTCAT TTTCAAGAAG GCTGTTCAAA AAATCAAAXA TCAGAAOCAa GAGIGAAAGC 1260 

ATCAGATCAC TGCAAAGTGT OAGAAGATGO GAAOTTGQCA tATATTATGA TTA CACTG AT 1320 

GTG7AGGCCT TTTATTGTTT GTTGQAATGO ATATGTACAA AGT6TAAATA AATGTTTCTT 1380 
TTCATTATOC TTAAAAAAAA AA 



Seq ID NO: 152 Protein sequence:
Protein Accession #: NP_076404

1 11 21 31 41 51

| | | | | |

MGEEOiTLAKl. PNNEIiBGQES BKSGNRSDGP GKNTTIAMBF DTIVLPVLYL IZFVASZLLH 60 

GIAVWIPFHI RNKTSFIFYL KNIWADLIM TLTFPPRIVH DAGFCPWYPK PILCRYTSVL 120 

FYANMYTSIV PLGLISIDRY LKWKPPGDS RMYSITFTKV LSVCVWVIMA VliSLPNIIbT 180 

NGQPTEWIIH DCSKLKSPLG VKMHTAVTYV NSCLFVAVLV ILIGCYIAIS RYIHKSSRQP 240 

ISQSSRKRKH NQSIRWVAV FPTCPLPYHI. CRIPFTPSHL DRLLDESAQK ZZjYYCKEjni 300 
PLSACNVCLD PZIYFFMCRS PSRRLFKKSN IRTRSESZRS LQSVRRSEVR lYYDYTDV 

Seq ID NO: 153 DNA sequence
Nucleic Acid Accession #: D80008.1
Coding sequence: 149-739

1 11 21 31 41 51

| | | | | |

GTTCGGOGCC AAAQCGOGGA GCGGAGGCOG AGGCGAGAGC CTGGOGCTGT AGQACTAOAA 60 

C6AAAGGAGT GAGGCGCOGA GAGCCCAGAT ACCATTTTGG CGTGAGAGCT GGTGGTTGGC 120 

AAGOCOGOGG GAGTGGGAAG CGTC06CCAT GTTCTGOGAA AAAGCCATGG AACTGATCOG 180

QGAGCIGCAT 0G0G0GCCXX3 AA6GGCAACT GCCTOCCTTC AAOGAGGATG GACTCAGACA 240 

AGTTCTGGAG GAGATGAAAG CTTTGTATGA ACAAAACXAQ TCTGATGTGA ATGAAGCAAA 300 

6TCAGGTGGA CGAAGTQATT TGATACCAAC TATCAAATTT 0GAC3VCTGTT CTCTGTTAAG 360 

AAATOSACXSC TGCACTGTAG CATACCTGTA TGACCGCTTG CTTOGGATCA GAGCACTCAG 420 

AT5QGAATAT GGTAGCGTCT TGCCAAATGC ATTACGATTT CACATGGCTG CTGAAGAAAT 480

GGAGTOGm AAIAA7TATA AAAGATCTCT TGCTACTTAT ATGAQGTCAC TGGGAGGAGA 540 

TGAAGGTTT6 GACATTACAC AOGATATGAA ACCACCAAAA AGCX^'ATATA TTGAAGTCOG 600 

GTGTCTAAAA GACTATGGA6 AATTTGAAGT TQATGATGGC ACTTCAGTCC TATTAAAAAA 660 

AAATAGCCAO CACTTTTTAC CrOSATGGAA ATGTGAGCAG CTGATCAGAC AAGGAGTCCT 720 

6GAGCACATC C7GTCATGAC CAT6C6G0GA GGCACTTCCA GGCTTCAGTC AACTCATQGA 780

C T OCTCTGTA CrCACTCTCT OCACCACTOC CTTCACCICC CICTTTGATT TTAOAAGCTA 840 

TAGAGATTGT TTAA6ATAAC TAAGAATACT T66CTAA6AA OTATAATTTO CTAACTATTA 900 

AGGACTTTCT TTTTTTAATG TTGTACACTA TTCTTCCTAC TCTrrrTTGG TTTTGGTTTT 960 

GTTTTGTAGA QACTOTCTCA CTATGTTOCC CAAGCTGGTC TCAAACTCCT GGCCTCAAGC 1020 

AGTOCTCCCA CCTTAGCTTC TCAAAGTGTT GAGATCACAG GOGTGAGCCA CT6CACCGQG 1080 

CCCCTACTCC TTTTTCTAAT AA6CT6TATC TGTAATCACA GCATTCCTAC AGTTGTTACA 1140 

Gmmrmi ' taaatqaaag taaacaiggt tacatttgaa tctcttaaat aagcaotcac 1200 

TTGGCTGGAC A66AA6AAGG TAGATC3CTGT GTGTCTTOTT TTCTGGTCAT GTGTATTGTA 1260 

CAAGCTAGAG AGCTGAATTT CTGAGATACA CATTTTCAAA TCACATGCAA GTGAAGATGA 1320 

TGQTCTGTAG AAATTTTCAG TATATATAAT GTTTAATGAC ATACTAATTT ATCATCTGGC 1380 

TATTTGGGAA 6GAAGQACAC ACATG6ATTT TGCACATTTC GAOCATGGTG 6CIGGTGTG6 1440 

CTTGTGGCTA T6G6GT6ATC AOCAGTATCA CX»CTTTQ6A A6G6GACAGT GAAA3TSG6G 1500 

CTAGAGAAGG AACTTTGTAC AOTTTTCOCT GAGATTCAta TTCACTGAAA AGTO«3lT6A 1560 

AGAGTTGATT GTCTTTTAAT GGTATGTTTT AAACAGCTGA CATTTTAAAT TTTGATGAAA 1620 

TCCAGTTTAT TOGTTTGTTC TTTTATGCTT TGGGTGTTGC ATCO CSAGAAA TCTTTTCCCA 1680 

TCCCAAGATC ACAATTTTTT TTCCTTTTTA CTTCTAGAAG TGTTATAATT TTAA6CTTTA 1740 

TACTTTGGTC TATGACCOGT TTTTTTTTTT GTTTTGTTTT GTTTTTTOGT TTOTTTCTTT 1800 

GTTTT6AGAT GGAGTCTTGT TCTGTCACCC AGGCTSGOGT GCAOZGGCGT GATCTTGGCT 1860 

CACTGCAATC TCTATCCCCT GGGTTCAAGT GA3TCTCTTG TCTCAGCCTC CCAAGTAGCT 1920 

GGGATTACAG GCACAGGCC8 CCA06CCTG6 CTAATTTTTG TATTTTTAGT AGAGACAGAG 1980 
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TTTTACCATG TTGGCCAGGC TGGTTTCAAA CTCCTGACCT CAAGTGACCC ACCTTGGCCT 2040 

CCCAAAGTTT TGGGATTACA AGTGTGGGCC ACOSOGGCCA GCCTATGATC CATTTTGAAT 2100 

GAATTTTTTA TATGCTTGCAA GGTGTCAATC CACCTTCACT TTTTCTTGGG AATATAGATA 2160 

TCCAGCTGTT TCACTAOCAT TTTTTGAAAG GACTGOCCTT TGCXCTATCA CCTTTGCKTT 2220 

TTTGTTAAAA AGTAGTTGTC AATGTATAT6 TGGGTTTATT TCAGGACTCT GTTTTGTTCC 2280

ATTGACCTGT TTTTCTCTCC TGAATGCCRA TACCATATTT GTATGTAGTG TATGTAATTT 2340 

TCTAATAATT CTTGAAACAG ATAGTATTAA TGTGTCATAT TTTTGCTOTT GTTTGTATTT 2400 

TTTGTAGAGA TGGGGTTTCA CCGfTGTTGGC CAGGCTGTGT TGAACTCCTG AGCTAAAGCA 2460 

ATACACTTGC CTCGTCCTCC CCATGTGCTG GGATTACAGG CGTGAGCCTT GGTGCXGGCC 2520

CAGTOTACCA CATTTCTTTT TGAGATTTGT TTTGGCTATG TTAAGTCCTT TGCTTTTGAT 2580

6TGAAATTT6 GGAACAGGCA GGGTGTGGTG GCTTATGCCT GTAATCCTAG AACTTTGGGA 2640

GGOCTAGATG GGTGGATCAC TT6AGCTCA6 GAOTTCCAQA CCA6CCC6GG CCTATGGCAA 2700 

AACTCOCTCT CTACAAAAAA TAGAAAAAAT TAGCCAGGTO TGGTGCTGCA TGCCTGTAGT 2760 

CACAGTTACA OGGCAGGCTG AGGTGGGAGG ATCACTTGAA CCCCAGAGGT CAAGACTGCA 2820 

<3TGAGCTGA0 ATCACAOCAC TGTACTCCAG CCTOG6TGAC AAAGTGAGAC TCTATCTCAA 2880 

AAAGAAATTA GGATC3UITTT GTCAATTTCT ACAACAACAA CAACAAAAAC CCCIGTT6GG 2940 

CAOCTTGATT GASATTGCAT TGAATTTATA TAAAACTCfTT GGGAGAATTG ACATCTTAAT 3000 

AATATTSAGT CTTCTGGCCT ATAAACAAGG TCTGTCrTCC TAGGTATTAA TGTTTTGTCT 3060 

TCTATTTCTC TTAATAATCT TTTGTAGTTT TCAGTGTACA GGTCTACCAT GTCAGCATTI 3120 

CATAGTTTTG AT6CTAAATG GTATTTTAAA ATTTCAAATT CTAACCACTT GTTGCTAGTA 3180 

AATAGAAATA CAATTGATGT T6AACTTGTA TOCTTCAGCC TTGCTAAACT GIGAGTTCTC 3240 

ATGGTGTTTT TGTAAATTAC ATCAACAGTC ATGTGTTCTA T6AATAAAGA GTTTTACTCC 3300 
TTC 

Seq ID NO: 154 Protein sequence:
Protein Accession #: BAA11503.1

1 11 21 31 41 51

| | | | | |

MFCEKAMELZ RELBRAPEGQ LPAFMEDGLR QVLEEMKALY EQNQSDVNEA KSGGRSDIiIP 60 

TIKFRHCSLL HNRRCTVAYIi YDRXiLRIRAL RHEYQSVLPN ALRFHKAAEB KEHFKMYKRS 120 

LATYMRSLG6 DE6LDITQDM KPPKSLYZBV RCLRE3YGEPB VDDGTSVIiXiK KZrSQBFLPRW 180 
KCSQLIRQGV LEHILS 

Seq ID NO: 155 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 149-709

1 11 21 31 41 51

| | | | | |

GTT0GG06CC AAAGGG0G6A G0GGA6GC0Q AG60GAGAGC CTGGOGCTGT AG6ACTA6AA 60 

OGAAAGGAGT GAOG06CXX3A GAOCCCAGAT ACCATTTTG6 OGTGAGAGCT GGTGGTTGGC 120

AAGGCOGCGG GAGTGGGAAG CGTCC3GCCAT GTTCTGCGAA AAAGCCATGG AACTGATCCX5 180 

CGAGCTGCAT CGOGCGCCCG AAGGGCAACT GCCTGCCTTC AACGAGGATG GACTCAGACA 240 

AGrrCTGGAG GAGATGAAAG CTTT6TATGA ACAAAACCAS TCTGAT6TGA ATGAAGCAAA 300 

GTCAGOTOGA OGAAGTGATT TGATACCAAC TATCAAATTT GGACACTGTT CTCTOTTAAG 360 

AAAT06ACX3C TGCACT6TAG CATACCTGTA TGACCGCTTG CTTCOGATCA GAGCSkCTCftG 420 

ATGGGAATAT GGTAGOGTCT TGCCAAATGC ATTAOGATTT CACATGGCTG CTGAAGAAAT 480 

GGAGTGGTTT AATAATTATA AAAGATCTCT TGCTACTTAT ATGAGGTCAC TGGGAGGAGA 540 

TGAAGGTTTG GACATTACAC AGGA7ATGAA ACCACCAAAA AGCC7ATATA TTGAAGC7GG 600 

ATGCAGTGGC GGGATCTCGG CTCAACCTGC AACCTCCACC TOCCAOGTTC A0CTCAACT6 660 

CAACCTCCAC CTCCCAGGTC OGGTGTCTAA AAGACTATGO AGAATTTGAA GTTGATGATG 720 

GCACTTCAGT CCTATTAAAA AAAAATAGOC T^GCACTTTTT ACCTCGATGG AAATGTGA6C 780 

AGCTGATCAG ACAAGGAGTC CTGGAGCACA TCCTGTCATG ACCATGCGCC GAGQCACTTC 840 

CAGOCTTCAC TCAACTCATG GACTCCTCTG TACTCACTCT CTCCACCACT CCCTTCACCT 900 

CCCTCTTTGA TTTTAGAAGC TATAGACATT GTTTAAGATA ftCTAAQAATA CTTGGCTAAfi 960 

AAGTATAATT TGCTAACTAT TAAGQACTTT CTTTTTTTAA TGTTGTACAC TATTCTTCfCT 1020 

ACTCTTTTTT GG ' lTn 'GGTT TTGTTTTGTA GAGACTGTCT CACTATGTTG CCCAA6CTGG 1080 

TCTCAAACTC CTQGCCTCAA GCAGTCCTCC CACCTTAGCT TCTCAAAGTG TTGAGATCAC 1140 

AGGCGTGAGC CACTQCACCC GGCCCCTACT CXTTTTTTCTA ATAAGCTGTA TCTOTAATCA 1200 

CAGCATTCX:T ACAGTTGTTA CAGTGTGTTT TTTAAATGAA AGTAAACATG GTTACATTTO 1260 

AATCTCTTAA ATAAGCAGTC ACTTG6CTG0 ACAGGAAGAA GGTAGATCCT GTGTGTCTTG 1320 

TTTTCTGGTC ATGTGTATTG TACAAGCTAO AGAGCTGAAT TTCTGAGATA CSICATTTTCA 1380 

AATCACATQC AAGTQAAGAT GATQGTCTGT AGAAATTTTC AGTATATATA ATGTTTAATG 1440 

ACATACTAAT TTATCATCTQ GCTATTTGGG AAGGAAGGAC ACACATGGAT TTTOCACAIT 1500 

TCCACCATGG TGGCTGGTGT GGCTTGTGGC TATGGGGTGA TCACCAGTAT CAOCACTTTG 1560 

GAAGGGGACA GTGAAATTGG GGCTA6AGAA GGAACTTTGT ACAGTTTTCC CTGAGATTCA 1620 

GATTGACTGA AAAGTCACAT QAAGAGTTGA TT6TCTTTTA ATGGTATGTT TTAAACAGCT 1680 

GACATTTTAA ATTTTGATGA AATCCAOTTT ATTCGTTTGT TCTTTTATGC TTTGGGTGTT 1740 

GCATCCGAGA AATCTTTTCC CATCCCAAGA TCAC3UITTTT TTTTCCTTTT TACTTCTAQA 1800 

AGTGTTATAA TTTTAAGCTT TATACTTTGG TCTAT6ACCC GTTTTTTTTT TTGTTTTGTT 1860

TTGTTTTTTC GTTTGTTTCT TTGTTTTGAG ATGQAGTCTT GTTCTGTCAC CCAOOCTQGG 1920 

GTGCAGTGGC GTGATCTTGG CTCACTGCAA TCTCTATCCC CTGGGTTCAA GTGATTCTCT 1980 

TGTCTCAGCC TCCCAAGTAG CTGGGATTAC AGGCACAGGC CGCC3^CGCCT GGCTAATTTT 2040 

TQTATTTTTA GTAGAGACAG AGTTTTACCA TGTTGGCCAG GCTGGTTTCA AACTCCTGAC 2100 

CTCAAGTGAC CCACCTTGGC CTCCCAAAGT TTTG6GATTA CAAGTGTGGG OCACOGCGGC 2160 

CAGCCTATGA TCCATTTT6A ATGAATTTTT TATATGGTOC AAGGTCTOU TCCACCTTCA 2220 

CTTm ' CTi ' G GGAATATAGA TATCCAGCTG TTTCACTACC ATTTTTTGAA AGGACTGCCC 2280 

TTTGCTCTAT CACCTTTGCA TTTTTGTTAA AAAGTAGTTG TCAATGTATA TGTGGGTTTA 2340 

TTTCAGGACr CTGTTTTGTT CCATTGACCT GTTTTTCTCT CCTGAATGCC AATACCATAT 2400 

TTGTATGTAG TGTATGTAAT TTTCTAATAA TTCTTGRAAC AGATAGTATT AAT6T6TCAT 2460 

ATTTTTGCTS TTGTTTGTAT TTTTTGTA6A GATGGGGTTT CAC0GTGTT6 6CCAGGCTGT 2520 

GTTGAACrOC T6AGCTAAA6 CAATACACTT GOCTOGTOCT CCXXKTOTQC TGOQATTAOV 2580 

GG0GT6A00C TTGGTGCTGG CCCA6TGTAC CACATTTCTT TTT6A0ATTT GTTTTGGCTA 2640 

TGTTAAGTOC TTTGCTTTPG ATGTGAAATT TGG6AACAQQ CAGGGTGTGQ T6GCTTATGC 2700 

CTGTAATCCT AGAACTTTGG GAGGCCTAGA TGGGTGGATC ACTTGAGCTC AGGA6TTCCA 2760 

GACCA6CC0G G6CCTATGGC AAAACTCCGT CTCTACAAAA AATAGAAAAA ATTAGCCAGG 2820 
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TGTGGTGGTG CATGCXTCTA GTCACRGTTA CAOSQCAGGC TGAQGTQGGA GGATCACTTG 
AACCCCACAG STC3UU3ACIG CASTGAGCTG AGATCACACC ACTGTACTCC AGCXTOGCTG 

ACMAonsu: acictatctc mmaagaaat tmsgaxcmt TTsro wrrr ctacaacaac 

AACMCAAM AOCOCTg f TO GGOOCTTQA naAOATTiGC ATIGAATTTA TATMAACTO 
TTOeGAQAAT VQACATCTTA ATAA1ATT6A eTCTTCIOGC CmAAACAA GOTCTSTCIT 

CCTWSGTATT AATGTTTTGT CTTCTATTTC TCTTAATAAT CTTTrGTACT TTTCAGTGTA 
CRGGTCTACC ATGTCRGCAT TTGATAGTTT TGMCCTAAA TGGTATTTTA AAATTTCAAA 
TTCTAACCAC TTGTT6CTAG TAAATAGAAA TACAATTGAT GTTQAACTTG TAT CCUTCA O 
CCTTQCTAAA CTGTGAGTTC TCATGGTSTT 1TT0TMATT ACATCAACAG TCAT6TQTTC 
TATGAA'CAAA GAGTTTTACT CCTTC 

Seq ID NO: 156 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MPCEKAHELI RBLKRAPEGQ ItPAFMSDGLR QVLEEMKALY BQNQSPVNE/V KSGGRSDLIP 
TIKFRHCSLL RNRRCTVAYli YDRLUIXRAL RHEyCSVLPN ALRFHHAAEE MEHFMNYXRS 
LATYMRSLGG DBSLDXTQDM KPPKSIiYIEA OCSGAISAQP ATSTSQVHLN CNLHXjPGPVS 

Seq ID NO: 157 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-621



TTCGGCGCCA 
GAAAGGAGT6 
AGGCC6C6GG 
GAGCTGCATC 
GTTCTGGAGO 
TCAGGTGGAC 
AATCGACGCT 
TG6GAATATG 
C6GTGTCTAA 
AAAAATAGCC 
CTGGAGCACA 
GACrCCTCTG 
TATAGACATT 
TAAGGACTTT 
TTGTTTTGTA 
GCAGTCCTCC 
GGCCCCTACT 
CACTGTGTTT 
ACTTGGCTGO 
TACAAGCTAG 
C3ATGGTCTGT 
GCTATTTG6G 
GGCTTGTGGC 
OGCTA GAGA A 
GAAGAGTTGA 
AATCCASTTT 
CATCCCAAGA 
TATACTTTGG 
TTGTTTTGflXS 
CTCACTGCAA 
CTGG6ATTAC 
AQTTTTACCA 
CTCCCAAAGT 
ATGAATTTTT 
TATCCA6CTG 
TTTTTGTTAA 
OCATTGACCT 
TTTCTAATAA 
TTTTTGTAGA 
CAATACACTP 
CCCAGT6TAC 
ATQTGAAATT 
6AG6CCTAGA 
AAAACTCaST 
6TCACAGTTA 
CASTGA8CTQ 
AAAAAGAAAT 
GGCACCTTGA 
ATAATATTGA 
CTTCTATTTC 
TTCATAGTTT 
TAAATAGAAA 
TCATGGT6TT 
CCTTC 



11 
I 

AA6C6GGGAG 
AGGOGCCGAG 
AGTGGGAAGC 
GGGGGCCCGA 
AGATGAAAGC 
GAAGTGATTT 
GCACTGTAGC 
6TAGCGTCTT 
AA6ACXAT6G 
A6CACTTTTT 
TCCTGTCATG 
TACTCACTCT 
GTTTAAGATA 
CTTTTTTTAA 
6AGACT6TCT 
CACCTTAGCT 
CCTTTTTCTA 
TTTAAATGAA 
ACAGGAAGAA 
A6AGCTGAAT 
AGAAATTTTC 
AAGGAAGGAC 
TATGGGGTGA 
GGAACTTTGT 
TTOTCTTTTA 
ATTCGTTTGT 
TCACAATTTT 
TCTATGACCC 
ATGGAGTCTT 
TCTCTATCCC 
A0GCACAG6C 
TGTTQGCCAO 
TTTGGGATTA 
TATATGGTGC 
TTTCACTACC 
AAAGTAGTTO 
GTTTTTCTCT 
TTCTTGAAAC 
GATGGGGTTT 
GCCTCGTCCT 
CACATTTCTT 
TGGGAACAGO 
TGG6TGQATC 
CTCTACAAAA 
CACGGCAGGC 
AGATCACACC 
TAGGATCAAT 
TTGA6ATTGC 
GTCTTCTGGC 
TCTTAATAAT 
TGATGCTAAA 
TACAATTGAT 
TTTGTAAATT 



21 
I 

CGGAGGCOGA 
AGCCCAGATA 
OTCCGCCATG 
AGQGCAACT6 
TTTGTATGAA 
GATACCAACT 
ATACCTGTAT 
GCCAAATGCA 
AGAATTTGAA 
ACCTOGAT(^ 
ACCATGOGCC 
CTCCACCACT 
ACTAAGAATA 
TSTTGTACAC 
CACTAT6TT6 
TCTCAAAGTG 
ATAAGCTGTA 
AGTAAACATG 
OGTAGATCCT 
TTCTGAGATA 
A6TATATATA 
ACACAT6GAT 
TCACCAGTAT 
ACAGTTTTOC 
ATGGTAT6TT 
TCTTTTATGC 
TTTTCCTTTT 
GTTTTTTTTT 
GTTCTGTCAC 
CTGGGTTCAA 
CGCCACGCCT 
GCTGGTTTCA 
CAAGTGTGGG 
AAGGTGTCAA 
ATTTTTTGAA 
TCAATGTATA 
CCTGAATGCC 
AGATAGTATT 
CACCGTGTTG 
CCCCATGTGC 
TTTQAGATTT 
CAGGGTGTGG 
ACTTQAGCTC 
AATA6AAAAA 
TGAGGTGGGA 
ACTGTACTCC 
TTGTCRATTT 
ATTGAATTTA 
CTATAAACAA 
CTTTTGTAGT 
TGGTATTTTA 
GTTGAACTTG 
ACATCAACAG 



31 
I 

GGOGAGAGCC 
CCATTTTGGC 
TTCTGOGAAA 
OCTOOCTTCA 
CAAAACCA6T 
ATCAAATTTC 
GACCGCTTGC 
TTAOGATTTC 
GTTGATGATG 
AAATC?TGA6C 
QAGGCACTTC 
CCCTTCACCT 
CTTG6CTAA6 
TATTCTTGCT 
CCCAAGCTG6 
TTGAGATCAC 
TCTGTAATCA 
GTTACATTT6 
OTG'IU'i 'L Trtf 

cacattttca 

ATGTTTAATG 
TTTGCACATT 
CACCACTTTG 
CTGAGATTCA 
ttaaacagct 
TTTGGGTGTT 
TACTTCTAGA 
TTGTTTTGTT 
CCAOGCTGGG 
GTGATTCTCT 
GGCTAATTTT 
AACTCCTGAC 
CCACCGCGGC 
TCCACCTTCA 
AGGACTGCCC 
TGTGGGTTTA 
AATACCATAT 
AATGTGTCAT 
GCCAGGCTGT 
TGGGATTACA 
GTTTTQQCTA 
TGGCTTATGC 
AGGAGTTCCA 
ATTAGCCAGG 
GGATCACTTG 
AGGCTOGGTG 
CTACAACAAC 
TATAAAACTG 
GGTCTGTCTT 
TTTCAGTGTA 
AAATTTCAAA 
TATCCTTCAG 
TCATGTGTTC 



AACCCCASA6 
ACAAAGTGA& 
AACAACAAAA 
TT6GGAGAAT 
CCTAGGTATT 
CAGGTCTACC 
TTCTAACCAC 
CCTT6CTAAA 
TATGAATAAA 



51 
I 

GGACTAGAAC' 
GTGGTTGGCA 
. ACTGATCCGC 
r ACTCAGACAA 
TGAAGOUUIG 
TCT6TTAAGA 
AGCACTCAGA 
TGAAGAAGTC 
CCTATTAAAA 
ACAAGGAGTC 
TCAACTCATG 
TTTTAGAAGC 
TOCTAACTAT 
' GGTTTTGGTT 
CT660CTCAA 
CACTGCACCC 
ACAGTTGTTA 
, ATAA6CA6TC 
I AT6T6TATTG 
' AA6TGAAGAT 
TTATCATCTG 
TGGCTGGTGT 
, 6TGAAATTGG 
A AAGTC ACAT 
ATTTTGATGA 
AATCTTTTCC 
TTTTAAGCTT 
GTTTGTTTCT 
! GT6ATCTTGG 
' TCCCAAGTAG 
GTAGAGACAG 
CCACCTTGGC 
TCCATTTTGA 
6GAATATAGA 
CACCTTTGCA 
CTGTTTTGTT 
T6TATGTAAT 
TTGTTTGTAT 
TGAGCTAAA6 
TTGGTGCTGG 

AGAACrTTGG 
GGCCTAT6GC 
CATGCCTGTA 
6TCAAGACTG 
ACTCTATCTC 
ACOCCTGTTG 
TGACATCTTA 
AATGTTTTGT 
ATGTCAGCAT 
ttgttgctag 
CTGTGAGTTC 
csvgttttact 



2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
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60 
120 
180 



60 
120 
180 
240 
300 
360 
420 
4B0 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 



Seq ID NO: 158 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

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MFCEKANELI KEI£RAPEGQ I.PAFHEDGLR QVIfSfKAliY GCSIQSDVNEA KSGGRSDLZP 60 

TZKFRBCSLL RNmCTVAYL YDRLLRIRAL SHEYGSVbPN ALRFBMAABB VROJEDIYGBP 120 
HVDD6TSVLL KKHSOHFLPR HKCEQLIRQG VZiEBZLS 

Seq ID NO: 159 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 149-229

1 11 21 31 41 51

| | | | | |

GTTCGGOSCC AAAGOGOGGA G06GA0600G A060GAGA6C CTGG06CTGT AGGACTAGAA 60 

0GAAA6GA6T 6AG606C06A GAGCCCAGAT ACCATTTT6G OGTGAGAGCT GGTG6TTG6C 120 

AAGGCCXSOSG GAGTGGGAAG OGTCCGCCAT GTTCTGCGAA AAAGCCATGG AACTGATCOG 180 

CGAGCTGCAT OGCGOGCCCG AAGOGCAACT GCCTGCCTTC AACAATTAGC TGGGTGTGGT 240 

GGCACACACC TGTAGTCCCA GCAACTTA6G AG6CTGAA6T GAGAGGA7T6 CATGGCTCCA 300 

GGAAGTTGAA ACTGCAGTGA ACT6T6GTCA CGCTATTACA CTCCAGCCTG GGTGACAGAC 360 

TGAIVTCCCIG TCTCAAAAA6 GAAAA6GA66 ATGGACTCAO ACftAGTTCTG GAGGAOATSA 420 

AAGCTTTGTA TGAACAAAAC CAGTC7GATG TGTTCTCTGT TAAGAAATCX3 A0QGT6CACT 480 
GTAGCATACC T6TATGAC06 CTTGCTTCGG ATCAGAGCAC TCAGATGO 

Seq ID NO: 160 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

ATGTTCTGCX3 AAAAAGCCAT GGAACT6ATC 06CGAGCTGC AT060G060C 0SAAGG6CAA 60 

CTGCCTGCCT TCAACAATTA G 

Seq ID NO: 161 DNA sequence
Nucleic Acid Accession #: U10694
Coding sequence: 1333-2280

1 11 21 31 41 51

| | | | | |

GGATCOGGCC GGATCTCAGG GAGGTGAGGA CTTTGTTCTC AGAGGGTGTC TGTGGACAAA 60 

ACAGGGAGGC CCTGTGTTCX5 ACAQACACAQ TGGTCCCAGG ATTGGAGAGC AGTCCAGGTO 120 

AG6AA0CTAA GGGAGGATOG A66GTACCTC CAGGCCAGAG AAACTCTCAG ATCAAGAGAG 180 

TTTGOOCTGC OCCTACTGTC ACCCCAGAGA GCC06GGCAG GGCTGTCTGC T6AGGTCCCT 240 

CCT7TAT0CT GOQATCACTG GTGTGGGGGA GG6CTGGCCT TG6TCTQAG0 GGGCTGCACT 300 

CACGTCAGCA GAGGGAGGGT CCCAG6CCCT GCCAGGAGTC CAGGTGCA6A CTGAGGGGAC 360 

CCCACTCAOC AAACACAGAG GACCTAGCCC CACCCTQCCC CTTGTGTCAG CTGAGGGAAG 420 

COGCTGGGTG GATGGACTCC CCTCACTTCC TCTTCAGGTG TCTCCTGGAG ATAGGGCCTC 480 

AGGTCAACAG AGGQAGGGTT GCAOACOCTG CAG6CATCAA GATGAGGAOC AGGCAGTATC 540 

CTCACCOCAG GACACATGGA CCCCATTGAA TTTAGACATC TCTTACTGTA CTTCOGAGGA 600 

AACCCTGGGC AGGTGTGGGC AGATGTTGGT TGGGGCATGT CCTTCTGTTC CATATCaUSQG 660 

ATGTGAGCTC CTGATCTGAG AGACTCTCAG GCAAGTAQAQ GAGTAOAGTC CAGTCCCTGC 720 

CAGGAGAAAG GTCAGGGCCC TGAGTGAGCG CAGAGGGGAC CATCCAOCOC AAAAGTGTGT 780 

AGAACTCAAO AGTGTCCAGC UCOCUCAtTr GACAGCACTG AGGGACGGGO 6CTCTGCCTG 840 

CAGTCTGCAG CCTAAGGGCC CCTOSATTCC TCTTOCAGGA GCTCCAGGAA GCAGGCAGGC 900 

CrrGOTCTOA GACAGTGTCC TCAGGTOGCA GAGCAGAGGA GACCCA6GCA GTGTCAGCA6 960 

TGAAGGTGAA GTGTTCACXX: TGAATGTGCA CCAAGGGCCC CACCTGCCCr AGCACACATG 1020 

GGACOXATA GCACCTGGCC CCATTCXXXX: TACTQTCACT CATAGAGCCT TGATCTCTGC 1080 

AOQCTAGCTG CAC6CTGAGT AGCC3CTCIGA CTTOCTQOCT CAGGTTCTOG GGACAGGCTA 1140 

A0CAG6AG6A CAGGAGCCCC AA6AG6CCCC AGAGCAGCAC TGACGAAGAC CTGTAAGTCA 1200 

GCCTTTOTTA GAACCTCCAA GGTTOGGTTC TCAGCT6AAG TCTCTCACAC ACTCCCTCTC 1260 

TCCCCAGGCC TGTGQGTCTC CATCGCCCAG CTCCTGCCCA CGCTCCTGAC TGCTOCCCTG 1320 

ACCA6AGTCA TCAT6TCTCT CGAGCAGAGG AGTCOGCACT GCAAGCCTGA TGAAGACCTT 1380 

GAAGOOCAAG GAGAGGACTT 6GGCCTGATG GGTGCACAGO AACCCACAGG 08AGGAGGA0 1440 

GAGACTACCT CCTCCTCTGA CAOCAAGGAG 6AGGAGGTGT CTOCTGCTGG GTCATCAAGT 1500 

CCTCCOCAGA GTCCTC3VGGG AG60GCTTCC TCCTCCATTT CCGTCTACTA CACTTTATGO 1560 

A6CCAATT0G ATGAGG6CTC CAGCAGTCAA GAAGAGGAAG AGCCAA6CTC CTGGGTCGAC 1620 

CCAGCrCAOC TGGAGTTCAT GITOCAAGAA GCACTGAAAT TGAAGGTGQC TGAGTTGGTT 1680 

CATTTOCTGC TOCACAAATA TOQAGTCAAa OAOCOGGTCA CAAAGGCAGA AATGCTGGAG 1740 

AOOGTCATCA AAAATTACAA GOGCTACTTT CCTGTGATCT T0G6CAAAGC CTCC36AGTTC 1800 

ATOCAGGTQA TCTTTGGCAC TQATGTGAAG GAGGTGGACC COGCOGGCCA CTCCTACATC 1860 

CTTGTCACTG CTCTTGGCCT CTOGTGCGAT AGC3VTGCTGG GTGATGCTCA TAGCATOCCC 1920 

AAGGC06CCC TCCTGATCAT TGTCCTGGGT GTGATCCTAA CCAAAfiACAA CTGCGCCCCT 1980 

GAA6AGGTTA TCTGGGAAGC GTTGAGlXnG ATGOGGGTGT ATGTTOGQAA GGAGCACATG 2040 

TTCTAGGGGG AGCCCAGGAA GCTGCTCACC CAA6ATTGGG TGCAOGAAAA CTACCTGGAG 2100 

TACCGGCAGG TGCCOGGCAG TGATCCTGOO CACTACGAGT TCCTGTGGGG TTCCAAGGCX: 2160 

CAOOCTQAAA CCAGCTATGA GAAGGTCATA AATTATTTGG TCATGCTCAA TGCAAGAGAO 2220 

COCATCTQCT ACCCATCCCT TTATGAAGAG GTTTTGGGAG A66AGCAAGA GGGAGTCTGA 2280 

GCAGCAGCaS CAGC06GGGC CAAAGTTTGT GGGOTCAGGQ OOOCATCCAG CAGCTGOOCT 2340 

GCOOCATGTG ACATGAG6CC CATTCrTGGC TCTGTQTTTG AAGAGA6CAA TCAGTGTTCT 2400 

CAGTGGCAGT GGGTGGAAGT QAOCACACTG TATGTCATCT CTGGGTTCCT TGTCTATTGG 2460 

GTSATTTQQA GATTTATCCT TGCTCCCTTT TGGAATTGTT CAAATGTTCT TTTAATGGTC 2520 

AGTTTAATQA ACTTCACCAT CGAAGTTAAT GAATCACAGT AGTCACACAT ATTGCTGTTT 2580 

ATGTTATTTA GGAGTAAGM TCTTGCTTTT GAGTCACATO 08QAAATCCC TGTTATTrTG 2640 

TX3AATTGGGA CAAGATAACA TAGCAGAGGA ATTAATAATT TTTTT6AAAC TT6AACTTAG 2700 

CAGCAAAATA GAGCTCATAA AGAAATAGTQ AAATGAAAAT GTAGTTAATT CTTGCCTTAT 2760 

A ccivrrr cT ctctoctgta aaattaaaac atatacatgt atacctggat ttgcttggct 2820 

TCTTTGAGCA TGTAAQAGAA ATAAAAATT6 AAAGAATAAT TTTTCCTGTT CACTGGCTCA 2880 
TTTTTTCTTC ACaoVOGCAC TGAACATCTG TTATTCGGAA CaCXXTGGGT T 



Seq ID NO: 162 Protein sequence:
Protein Accession #: AAA68877.1






1 11 21 31 41 51

| | | | | |

KShEQRSPaC KPDEDLSAQG EDLGLMGAQB PTGEEEETTS SSD5KEEEVS AAGSSSPPQS 60 

PQGGASSSIS VYYTLWSQFD EGSSSQEEEE PSSSVOPAQL EFMFQEALKL KVAELVBFLL 120 

BKYRVKEPVT ECAEKIiESVIK HYKRYPPVIF GKASEFMQVI FGTDVKBVDP AGHSYXLVIA 180 

LGLSCDSMLG DCBSMPVAXL LIXVLGVXLT RDHCAPEEVZ WBALSVMGVY VGKEHMFVGB 240 

PRKLLTQDHV QEEmiBYRQV FGSDPAHYEF LHGSXABAET SYBmNYLV MLNARBPZOr 300 
PSZiYSSVXiGE EQB8V 

Seq ID NO: 163 DNA sequence
Nucleic Acid Accession #: AF292100
Coding sequence: 30-809

1 11 21 31 41 51

| | | | | |

6GGGG666AG AG6GCTGGAG GACACCAACA T6AACAAGTT GAAATCAT06 CAGAAG GATA 60 

AAGTTCGTCA GTTTATGATC TTCACACAAT CTAGTGAAAA AACAGCAGTA AGTTGTCTTT 120 

CTCAAAATX5A CTGGAAGTTA GATGTTGCAA CAGATAATTT TTTCCAAAAT CCTGAACTTT 180 

ATATAOGAGA GAGTGTAAAA GGATCATTG6 ACAG6AAGAA GTTAGAACAG CTGTACAATA 240 

GATACAAAOA CCCTCAAGAT GA6AATAAAA TTGGAATAGA TG6CATACA6 CAGTTCT6TG 300 

AT6ACCTGGC ACTC36ATCCA 6CCAGCATTA GTGTGTTGAT TATTGOGfTGO AAGTTCAGAG 360 

CA6CAACACA G7GCGAGTTC TCCAAACA66 AGTTCATGGA TG6CATGACA GAATTAGGAT 420 

GTGACAGCAT AGAACAACTA AAGGOCCAGA TACCCAAGAT GQAACAAGAA TTGAAAGAAC 480

CAG6ACGATT TAAGGATTTT TACX3W3TTTA CTTTTAATTT T6CAAAGAAT CXAGGACAAA 540 

AA6GATTAGA TCTAGAAATG GCCATTGCCT ACTG6AACTT AGTGCTTAAT GGAAGATTTA 600 

AATTCTTAGA CTTATGGAAT AAATTTTTGT TGGAACATCA TAAAOGATCA ATACCAAAA6 660 

ACACTTGGAA TCTTCTTTTA GACTTCAGTA CXSATXSAITGC AGATGACATO TCTAATTATG 720 

ATGAAGAAGG AGCATGGCXT GTTCTTATTG ATGACTTTGT GGAATTTGCA OGCXXTCAAA 780 

TTGCTGGGAC AAAAAGTACA AC31GTGTAGC ACTAAAGGAA CCTTTTAGAA TGTACATAGT 840 

CTGTACAATA AATACAACAG AAAATTGCAC AGTCAATTTC TGCTGGCTGG ACTGAACTtSl 900 

AGATCAATCC TCACAATTCA GACTGAGGGT TGAGACAAAA CTTTAAGGAT ACATCTTGGA 960 

CCATATOGTA TTTCATTCTT CTAATGGTGG TTTGGGCTrG TCTTCTAGTC TGGGCOSCTC 1020 

TAAAC3^TTTA TAATTCCAAC ATTGTGGATT TCATCTTATA TCTGTGGACX: ATCCTAGTTT 1080 

ATTCTCCCAT AAGTCTTAGA AGCTTTATGG TGATTATTTT QAGGTTTTCA TTCTCGCATA 1140

AAGCACAATG CTGTCTTCAT CAGAAAACAG TTGGCATAAG AATTAAACAT ATGAACATCA 1200 

CAAAACAATT TATAAAAACT TCTTAAATAT ACGCTTTGGG CTAGTTGCAA AGACTATGCT 1260 

AATAOCACTT CCAQTGA6A6 TGATATATTT AA6TGTACTG 6ATCTGGAAT GGTGTTTTGG 1320 

TTTGGGGOGA ATTTTTTTTT TTTCCTGGCA AATCACATAT GTTGTTGATG TGAGTATCTG 1380 

ATGAAAAAAC AATGTCAGAA TAACXXSACAT GAAAATTTTT TAGGATAACT TGGTGCCTAC 1440 

CTGAAAAATG TATTGTGTTT TAGACTCTTG ATTTCAAAAG GTTCCACAGA ACTAGTCTGC 1500 

GCTTACCTTA CCCATGTTTA TATATAGCTG TCCTACAGGG AGCTTTTATT TAGAAAATGT 1560 

CTOCATAATG TTAGATTCTT CTCCT6TCTA CATTATGCAC TACATAATTG GACTTCATTA 1620 

TGCTTTTGAA ATGCTTATCT GOCTGTCACA TAAGTTAAAC TATTTAATTT GTTTTGAAT6 1680 

TTTTGGATTG CTACACAATA CAATATTCTA AATTTAGGCA TGAGGGTTTT TTTGTTTTAT 1740 

TTTTACTTTT TTTTTGTCAT TGCACTATGG AACACAAATG AAATTCTCTT AATTTATAAG 1800 

AAGATAGTAG GAGTTAAATT TTGAAAATGG TTGTGATGA6 COkOGAAATT CAATCTTTAT 1860 

AATATAGGTA CIOCTCTTTC AGACAAACAG TCCATTTTTA ATGACTTCTT ATTTTGTTGA 1920 

AATTACTTTA ACTOCTAATC ACTGTGGTT6 CCAAATATTT ACTTCAQAAS CAAA6ATTTT 1980 

CAAACAAGCA TACACGATGC AAAATACCAG TCTGGCTTCT A6TCTATTTA Ci ' GTr r mrr 2040 

TCACTCAGAT TAGCTCAGTT TTCTCATCAA AGCAGAATGC TATCTTOOGT GTGTGTGTGT 2100 

GTGTGTGTGT GTGTGTGTGT GTATGTGTGT ATATATATAT ATATATATAT ATATATATTT 2160 

rmvm"rr ttttttttaa attacaaaag ocazgagctg cttttatgct gaaaatggtc 2220 

ATTTCCCTGT TCACTTACTG ACATGT6AAG AAGGGTTTCT TGCTTTCTTA AACATTTCOG 2280 

TAAQGCAGGC TAGAAATGTA ATACTTCAAA TGTTTGATGA TTATGGTCTT TTGATAGGAA 2340 

TAGATTCTGC TTGGGATATA TATCCAGGCA CTCTCTAAGO TCTAGGGTTG ATATTAACAA 2400 

AGGAATGTAC TTAGAATAGC AGTACATTTT ATGCAAATAT GGAAATTATT TTAAGAAACA 2460 

ATGACATATC AAAACTOCTT TTTACATGAT TTTQAAATAG ACTAGAAA8C TTTGCCTATA 2520 

6ACATATTAA TATTCCAATC ATAACTTTAA TTCAA6AATG CAGTTTTACC AAAA6AAAAA 2580 

TTTGAAAATT TCTATTCAGG CTACTGGAAT TGGTTATTAA AAGAAAAAGG AAAAAGAAGA 2640 

ATCTTGCTGC TTTCAGTATT TCCTGATTTT TTTGTAAATA TAAAGAGGAA CTTCAATTAT 2700 

GAAAAATTTT TAAAAGATAT ATATATCTAT ATATCTATAT ATATGTACTG TTTTGTTTOC 2760 

TGTCTTGAA6 ATTITGAGTT ATOGTTATTG GTTTCAGKTT GATTAATTCA CATATGCTGT 2820 

GTTJTCrTTA AAAGTCATAT GGGTTCGTGG CCTAATGCCT TGGATTTTAC ATATTTTTCT 2880 

TTTTAAATGC AAAACCTTTT CAACAAAATA GTGTTTGTCA TCAGGTTGGT ACTAAACATT 2940

TATAATTACT GTGTAATTAT AAACAAAAAT ACATAAAGCT TTGAATATAA TTATGTAOCA 3000 

TAAAAGTTAA GGTTGTTCAC TATGATGGCA TCTTAGAATT AAACAAAACT TTTACTAGGG 3060 

CTGAAAAGAG AAGACTGATT TAATGTGGT G TGATTATTCT GAAGATAAAT GTCTGGCTAC 3120 

AGGGAATATT TTGTACTAAA AAATGATTAC ACATATGGCT G Ttfi'bTG'm' GAGTCTGTGT 3180 

CrOTGAGAGA GCCAGA6A6A GTGAGAGA6A TTGACAGAGA AAGGGAGAGA CACACACAOG 3240 

CCCCTTOAAT TGCTTTAACT CCTAAGTGTT TCAGTCCTCA TTCCX3GTAAA CTCCCCATGC 3300 

T6ATTCTTTG TTTTAAACTG AACCATAGGT ACAGTTTCCT TTTIGCCAAA TGTCAAAACA 3360 

GGTACAAATT TTAAAATGTA ATGCTTTTTA AATA6AAAAA TGTATAAAAT TAGAASTGCC 3420 

CACATATAAA AAATACTTGA GATGAAOATT ATCTTTAGT6 AATATCATCT GCAIATCTCT 3480

GTAAQTTCAA TTGTGTTTCT TACAGTCCCT GTCATATTAC CAACAGAGGC AATAAAAGCr 3540 
GCAGTOAAAT TG 

Seq ID NO: 164 Protein sequence:
Protein Accession #: AAG00606

1 11 21 31 41 51

I I I I I I

MKKUC8SQKD KVRQFMXFTQ SSEKTAVSCL SONDWKUTVA TDNFFQNPEL TZRBSVKGSL 60 

DRXKLEOLYN RYKDPQDaiX ZQZD6ZQQFC DDXALDPASZ SVLIZAWKFR AATQCEFSKQ 120 

EFMDGMTELG CDSIEQIiKAQ IPXKBQELKB PGRFKDFYQF TFHFAKKFGQ KQU)IiEMAIA 180

YWNLVLEIGRP KFXiDLHNKFL XiEBHKRSZPK DTHNLLLDFS IMZADOMSOT DEEGANFVLI 240 
ODFVBFARPQ IA6TKSTTV 

Seq ID NO: 165 DNA sequence
Nucleic Acid Accession #: AF256215
Coding sequence: 220-2028

1 11 21 31 41 51

I I I I I I

CTCCAGTCCG CATGCTOU3T AGCTGCTGCC GGCCGGGCTG OGGGGOGGCG TCOSCTGOGC 60 

GCCTACGGGC TGOGGTGGCG GCCGCOGCGG CACCOGGCAG GGCCC6CCAG TCCCCGCTTC 120 

CCTGCTCCA6 AGCCGCC6CC TGGGCCX3GGG CAGGGOSGGC CCGG6GCTCC TCCATGCTGC 180 

CAGCCGCCQG GCTGGGGAGC GGACCAAGT6 GCTCCTGOGA TGGOGGCGGA AGAGGAGGCT 240 

G0GGCX3GGAG GTAAAGTGTT GAGAGAG6A0 AACCAGIGCA TTGCTCCTGT GGTTTCCA6C 300 

CGC6TQAGTC CAGGGACAAG ACCAACAGCT ATOGGGTCTT TCAGCTCACA CATGAOWSAG 360 

TTTCCACGAA AACGCAAAGG AAGTGATTCA GACXCATCCX: AAGTGGAAGA TGGTGAACAC 420 

GAAGTTAAAA TGAAGGCCTT CAiSAGAAGCT CATA6CCAAA CTGAAAAQ06 GAGQAGAGAT 480 

AAAATGAATA ACCTGATTGA AGAACTGTCT GCAATGATCC CTCAGT6CAA CC CCAT GGCO 540 

CGTAAACTGG ACftAACTTAC A6TTTTAA6A ATGGCTGTTC AACACTTGAG ATCTTTAAAA 600 

G6CTTGACAA ATTCTTATC?r OGGAAGTAAT TATAGACCAT CATTTCTTCA GGATAATGAG 660 

CTCAGACATT TAATCCTTAA GACTGCAGAA GGCTTCTTAT TTGTGGTTGG ATGTGAAAGA 720 

GGAAAAATTC TCTTCGTTTC TAAGTCAGTC TCCAAAATAC TTAATTATGA TCAGGCTAGT 780 

TTGACTG6AC AAAGCTTATT TGACTTCTTA CATCCAAAAO AT0TT6CCAA A6IAAA0QAA 840 

CAACTTTCTT CTTTTGATAT TTCACCAAGA GAAAAOCTAA TAGATGCCAA AACTGGTTTG 900 

CAAGTTCACA GTAATCTCCA CGCTGGAACG ACACGTGTGT ATTCTGGCTC AAGAOGATCT 960 

TTTTTCTGrC GGATAAAGAG TTGTAAAATC TCTGTCAAAG AAGAGCATOS ATGCTTACCC 1020 

AACTCAAAGA AGAAAGAGCA CAGAAAATTC TATACTATCC ATTGCACTGG TTACTT6AGA 1080 

AGCT G OCCTC CAAATATTGT TGGAATGGAA GAA6AAAGGA ACAGTAAQAA AGACAACAGT 1140 

AATTTTACCT G0CTTOTQ6C CATT8GAAGA TTACAGCCAT ATATTGTTCC ACftGAACAGT 1200 

GGAGASATTA ATC5TGAAACC AACTGAATTT ATAACCCGGT TTGCAGTGAA TGGAAAATTT 1260 

GTCTATGTAG ATCAAAGGGC AACAGOGATT TTAGGATATC TGCCTCAGGA ACTTTTGGGA 1320 

ACTTCXTGTT ATQAATATTT TCATCAAGAT GACCACAATA ATTTGACTGA CAAGCAGAAA 1380

GCAGTTCtAC AGAGTAA06A GAAAATACTT ACAGATTCCT ACAAATTCA6 A6CAAAAGAT 1440 

GGCrCTTTTG TAACTTTAAA AAGCCAATGG TTTAGTTTCA CAAATCCTTG GACAAAAGAA 1500 

CTGQAATATA TTGTATCTGT CAACACTTTA GTTTTGGGAC ATAGTGAGCC TGGAGAAGCA 1560 

TCATTTTTAC CTTGTAGCTC TCAATCATCA GAAGAATCCT CTAGACAGTC CTGTATGAGT 1620 

GTACCTGQAA TGTCTACTGO AAC3M3TACTT GGTGCTGGTA GTATTGGAAC AGATATTGCA 1680 

AATGAAATTC TGGATTTACA GAGG7TACA6 TCTTCTTCAT ACCTTGATGA TTGSAGTCCA 1740 

ACAOGTTTAA TGAAAGATAC TCATACTCTA AACT6CAG6A GTAT6TCAAA TAAGGAflTTO 1800 

TTTCGACCAA GTCCTTCTGA AATGGGGGAO CTAGAGGCTA CCAOQCAAAA CCAGAGTACT 1860 

GTTGCTGTCC ACAGCCATGA GCCACTCCTC AGTGATGGTG CACAGTTGGA TTTOGATGCC 1920 

CTATGTGACA ATGATGACaVC AGCXaTGGCT GCATTTATGA ATTACTTAGA A6CAGAGGGG 1980 

GGCCTGGGAO ACCCTGGGQA CTTCAQTGAC ATCCAGTGGA CCCTCTAGCC TTTGATTTTT 2040 

AACTCCAAAA ATGAGAAACA TTTTAAAGCS^ TTATTTAOGA AAAAACTGTC TCAACTA7TC 2100 

TTAAGTACTG TATTGATATT GTTTGTATCT TTTATTAATG TTCTACCACT TTTTATAGAT 2160 

TTGCATCTTC CTGTCACAGG GATGTGGGGA AATACGTTTT CCTCCCAAGA GAACCAAGTT 2220 

TATTATAGAC TCCTTTATTC AGTGAAATGG CTTATAATCC ACTAGTTGCC ATATTTTTGC 2280 

TAAAATATTT CTAACCAAGA ATACTACTTA CATATTGTTT TGGCTTTOTT TtATTTTTGA 2340 

TOCABTTTTT TTTAGTTGAG GTAATGTAAT ATATTGATGT TTTCCTTT6T GTCTAAGATT 2400 

GATTTATAAT AGTAGGTTTG TATAATTT GG AACATTTTCC ATGCCTTGCG AATTTCCTTA 2460 

ATTGAGGATA GGGCTTACAC ACTTTAAGAA AACAGTGAGT ACTTGAACAT TTAAAOGGAC 2520

AGTGCAATTT ATAGTCATAA TCACATTGAA TACTGTATTT GATCTTTGGA GACTTAGGCA 2580 

AGCACAGA6C TGGGATATTT ATGCTCAGTT GAGCACTTTA AGATGAATTT TAAGTGA6AT 2640 

GATTTCTTGC TTAAAACTCA GAAAGTCAAA AGAGTTTCAG CTTTCCTTAC AGAAAAOGAA 2700 

OGATCTTGGG CCCTAGATCT TGGGGATTAA CCTCTGCATA TAAGATTTAC TCTTAATAGG 2760 

CCAGAOGTGG TGCTCAOGCC TGTAATCCXav GTACTTTGGG AGGCTGAGAC GGGCAGATCA 2820

CTTGAGGTCA GGAGTTCAAO ACCAGCCTGG CCAATATGGT GAAACCCCGT TTCTACTAAA 2880 

AATACAAAAA AAATTACCCA GGCACTCACT CTTGAGGTAA CTAACCAACT CCCAOGATAA 2940 

TGACAGTCCA TTCATGAGOO CAAAGGCCTC ATGACCTAAT GGCACACACC TGTAATCCCA 3000 

ACTGCTTGGO AGOCTGAQGC GAGAGGATTG CTTGAACCTG GGAGGCAGAG GTTGCAGTGA 3060 

6C0GAGAT06 CACCACT6CA CTCCAGTCT6 GGCAACAGAG TGAGACTTCA TCTCAAAAAA 3120 

AGTAAA7AAA AAGATTTAAT ATAATCACTG AAGATCTCTA TTATAGATAG ATTAGGTTTT 3180 

TGACATTGGA AACATACTTA 6GGATAGATT TGTCCTAAAQ GAAAAAAGTA GGOCOGGGCA 3240 

GATTAAATGT CTTGTQTAAA GTCACACATT AAATTCAGTC ACACATTAAA TTCATAGAGT 3300 

TTTAAATOTT TAATOTATAT AAACCAGTTT CTTTATACAC ATTTGGGAAA ACATrGGTCT 3360 

CACRGATTAA ATGATTAACT AACTGACCCA GGAACTA6TT GTAGCTTTCT AABTAATTAQ 3420 

GCAATTACAG TTATTGCCTG TAACC3UUUX5 TAATAAAACA AAATGAOUVQ TACATGTTTA 3480 

AAATTATGAG GCAATGAGAA ATAATTTAAA AACCAATTTT CTAGTTATAA TTTAAAATTT 3540 

GGAGA6CATT TTTAACAQTA ATTAATCCAG AGGTGGCTCA AATTGAGTAT AAGAATTAAG 3600 

ATTATTTAAA ATACTOCAtO TCTACCTTCT CX3GGGATCAT ACTTTATAAC ACTTTCTGCT 3660 

TCA9I7USCTC TTCATAGCTT 6CCAAGTATG CTCCCATATT TTCTCTCTOG TGCCTOGCAA 3720 

ATGAAAGTCA GATAGGCTGG GAACTCATGG GGCAGCCCTC AGACTTCAAT GTG6GCTTCA 3780 

AATCCAGTTT CCTGTTCTAT ATGGTGCTAC ATCTTTOCAO AAAATTTCCC TCAGAGCCCC 3840 

TCGCX3VAAAC AAAGCATTAT TTTGACXCTO CATGCTATTI CTTTAGCTGT AGGTGATAQA 3900 

TTAOAACTTC TGTCAGACAT GTTAATGACA AACATACCAA CAQACAATAA CCAAAGCAAA 3960 

TGTTTOCTTC AAGTOTGAAA TGTOCAOGGO CTCGTOGGCA AGGATGTATT GGCACACTGT 4020 

CCTCTTGAAC TGATAGTGTC CCAGCAATGT TGGAGGTTGO CACCATTCCT GGTOOGACAC 4080 

TTGAGGACCT GAGAQACATC AGGTTTAGAA TGAGCCAAAG AAATCCTACA AGATGQGGAG 4140 

AAT T GGTGTG CAGCAGCCTA AGTGTTATAG TTAAGTCTAA AOAAGTATGA AAGATCCCCT 4200 

GT G TTCTCTA AATTGAGCA6 AOGGGCXriGC CtACCAATAT CACTTTTTAG G0QACT6AAC 4260 

CATTQCAOGT TAGACTTOGC TTCCAAAGAa TCTGCCIAAQ CCAGQG6TGG CAGGGTA6GC 4320 

CATCATAGCT GGATGGCCTC AAAAGCA6AT GGGGGCAGAC TTGCOCTGGT GATGCCA6GA 4380 

TTTGAGAGGC AGAGTTTCTA GA6GGAGAC3C AGTGCTGOCT CTCACAGTGG GAGTTTTTTC 4440 

TCTTT6CAAG AGGAGGGGCT GTTCAATTCC ATAGACCAGT QGGCAQATAG CCAGTTGAAT 4500 

ACTCIUTGCA TGGTTTGATC CTTTATTAGT TOGCTCTAAT ATTTTTCTGT AGATOCTTTT 4560 

GTCCTOGACT CAAAATCTAA TCCATGCATT GTATG ATACC GTAGCTC TOC TAAGGTTTGT 4620 

GTTTOCTTCA AAATGITTTA GTTTTCTTCA ACTAAATTTG ATTTTTGCT6 TTAGAAGTGA 4680 

CATATTTTTA TGGTATACAC TATGTTCCTT TTTTCTACTG OGAGTCAATT TTTTGAATTT 4740 

TGGTGAGAAA GAATATATCT ACAAATTGCA 06AAA6TATC ATAAAAACA6 TACTCTAGA8 4800 
CAGOSCTGTC CAHTAGAAAT ATAATCTGA6 OCAOITGTAT AATTTTATTT TCTTCTA6CC 4860

ACATTAAAGA AGTAAAAAGA TACAAGTAGA AC7AATTTTA ATGTmAAT TCAGTATATC 4920 

C3VAAATATCA TTTGAACATG TAATTAATAT AAAATTATTA ATGTGATATT TTACATTCTT 4980 

TTGGTAATAC TAGTCTTCAA AATCTGGTAT GTATCTTACA TTGATAGCAC ATCTCACTTT 5040 

6TACTAGCCA CATTGCAAGT 6CTCAGTAGC CACATGTGGC TAGTGGCTAC TGCACTGGAC 5100 

A6CACAGTTC TAGGTTCCAC OCEAACAOQC AAOTOCTOTO QATTAGAATC CCA6AATCAG 5160 

AGCTGSU^GT AAACATAGA6 ATCAAACCTC CTTTTAAAAA TtSAOGAOGCT GAG6CACAGA 5220 

GTTTAAATGG CTTGCATGAG GTCATACAGC TAAATTCAGC CTCAACAGGG TCTTCTGATT 5280 

CCAGGCACTC TTCCCACTCC ACTACATTAC TGTA£?IGGTA ATTCTTAGGG TTAAAAAAAG 5340 

TGTAGAGTA6 G00GGGC6CA GTGGCTCATG CCTGTAATCX: CAGCACTTTG GGAGGOOGAA 5400 

GTOS G CGGAT CAOSAGGTCA G6AGAT06A6 ACXATCCIGG OCAACATGGT GAAACCOOGT 5460 

CTCTACTGAA AATACAAAGC AAAATTA6CC AGGTGTGGTG G0GG60GCCT 6TGGTCCCAG 5520 

CTGCTCTGGA GGCTGAGGCA GAATGGCGTO AACCCAGGAG GCA6AGATGG CAGTGAGCCA 5580 

AGAT06GGCC ACTGCACCCC AGCCTGGGCG ACAGAG06AG ACTCCATCTC AAAAAAAAAA 5640

AAAAAAAAAA AAGAAAAGAA AAGAAAA6TC TAGAGAACAT TATATTAAGT G6TTATTATT 5700 

GAAGTAGACC AAAGTTTATA OCATAAGGAT ATTTTTOCTT AAATACCAT6 TTTGAAGAAC 5760 

AATTATTTAT TGATCCTTGA ATCTGTAAGA TCAAATAACA AGTCTCTATC CATGTTACCA 5820 

AATTTAACXrr TTTGAAAATA ATAAACTTTA AAATATC3VGA TGTGTTATTA CAGGAT6ATA 5880 

CTT6GAATCA AGTGAAATGA CTTATATGGT CATCACTAAA TTTAGAAATC TATTGTGAAA 5940 

CAAAGACAAA CAGGAAAGTA CAGAATAGAG ACTTTTAGTA AATAAATGGA ATTTAAAAGA 6000 

AAGTGTTTAT TTACAGTGTC AOGACAGAAA AGGAT6TCTT TGTTGTCATA GTCTTTGAGG 6060 

GATCTCCX3TA AAATCTGGGG CACAGGTACA AGAAATAGCC AATATTTAGT TCCCAGACCA 6120 

TGTTTAGTAG TGTCCAGTTT CAGATCATGC TGCCAAGAGQ TATCTCCCCC TCAGGTGGGT 6180

CATCACTGAG CCCTGGAATT GGAGACTCAT ACTTGCCCAG CACAATGTTA CGGGCAGACA 6240 

GGCCGACATC TATGATTAGC TAGAAGCCAT AAAGAAAAGC TGCTAAGTGG (XACTAGGTG 6300 

CCACTTTTCT GTTTTTGTAA TGCTTTCATT AGCAGATCTT TTTTTTCCAA GCTCCATGGG 6360 

GCCTATGAGA GGCATTTATG ATTTTTGTGC CTACAATAAG TCAGOCTGTC TGGTGTGAGT 6420 

TGTTTTATGA GAAATGCTTT CCAAGGQAGG TCTAGGAAGA TCCTGACACA TAAGAACTTT 6480 

G6CTTAGAGA GCTTTCCAGG TGTAGTOCCA ATAAAAACTG ACXTGGAAAG AAAACCTGCC 6540 

CAGCAC3GGAA CATGCTTTCT GAACTCACTT GAGAGTGTAT GGTGTATGTC ACTTCTCATA 6600 

TATTCTTGAG TTTAGATTTG TCTTTTATAC AATTTTTAGC TCTTTTCCAG TTCACTTGTG 6660 

CTCGTCTGTA TATTGGTATT TTTAAATTTT TGTGGTAAAT AATGAAAAGA GTGAAATTAT 6720 

ATTTTATAAT TACTCATTTG TAGTTTTTTT .TTTTAATTTA ATAAACTTCC TCCAAAAAGT 6780 
GCTCCCTTAA AA 



Seq ID NO: 166 Protein sequence:
Protein Accession #: AAG34652

1 11 21 31

I I I I

MAAEBEAAAG GKVLREENQC XAPWSSRVS PGTRPTAMGS 
QVEDOEHQVK MKAFREABSQ TSKRRRDKMH NLIEBLSAMI 
QBLRSLRGLT NSW6SNYRP SFLQDNELRR LILKTAB6FL 
LNYDQASLTG QSLFDPMPK DVAKVKBQI*S SPDISPREKL 
YSGSRRSFFC RIKSCKISVK EEHGCLPNSK KKKHRKFYTI 
NSKKDNSNFT CLVAIGiOiQP YIVPQNSGEI NVKPTEFXTR 
LPQBLLGTSC YEfFBQDDBS IILTDKHKAVL QSKEKZLTDS 
TMPWTKBZiBy IVSVMTLVLO HSEPGB^&SFL PCSSQSSEBS 
SZGIDIANEI LDLQRIiQSSS YU3D88FT6L MXZyTRTVNCH 
TRONQSTVAV HSREPLLSDG AQLDFOALCD MSDTAKAAFM 
Til 

Seq ID NO: 167 DNA sequence
Nucleic Acid Accession #: NM_014400
Coding sequence: 86-1126

1 11 21 31 41 51

I I I I I I

QGTTACTCAT CCTGGGCTCA GGTAAGAGGG CCCGAGCTG8 GAG6CGGCAC ACCCAGGGGG 60 

GACGCCAAGG GAGCAGGACG GAGCCATGGA CCC06CCAGG AAAGCAGGTG CCCAGGCCAT 120 

GATCTGGACT GCAGGCTG6C TGCTGCTGCT GCTGCTT06C GGAGGAGOGC AGGCCCTGGA 180 

GTGCTACAGC T006TGCAGA AAOCAQATGA 0G6ATGCTCC COGAACAAGA TGAAGACAOT 240 

GAAGTGOQOG COGGGCGrTGG ACGTCTGCAC CGAG6C0GTX3 GGG6C6GTQG AGACCATCCA 300 

CGGACAATTC TCGCTGGCAG TGCSGGGTTG 0GGTTC3GGGA CTCCCCGGCA A6AATGACCG 360 

OGGCCTGGAT CTTCAC3GGGC TTCTGGCGTT CATCCAGCTO CAGCAATGOG CTCAGGATCG 420 

CTGCAAC3GCC AAGCTCAACX: TCACCTOGCG GOCSSCTOGAC CCGGCAGGTA ATGAGAGTGC 480 

ATACXaXXX: AAOGGGSTGG AQTGCTACAS CT G TgrOG GC CTGAGCXX^OG AGGOGTGGCA 540 

GGGTACATOG C05COQGTCO TGAGCTOCTA CAAOGCCAGC 6ATCATGTCT ACAAGGGCTO 600 

CTTOGACGGC AACSTCACCT TGAC3GGCAGC TAATGTGACT GTGTCCTTGC CTGTCOSGGO 660 

CTGTGTCCAQ 6ATGAATTCT GCACTCOGGA TGGAGTAACA GGOXAGGOT TCACGCTCAO 720 

TGGCT0CT6T T6CX31GGGGT CCC3GCTGTAA CTCTGACCTC CGCAACAAGA CCTACTTCTC 780 

CCCT06AATC CCACOOCTTG TCGQGCTGCC CCCTCCAGAG CCCACGACTG TGGCCTCAAC 840 

CACATCIGTC ACCACTTCTA CCTGGaCCCC AGTGAGACCC ACATCCACCA GCAAACGCAT 900 

GCCAGOGCCA ACCAGTCAGA CTCGGAGACA GGGAGTAGAA CACGAGGCCT CCCGGGATGA 960 

GGAGOCCAGG TTGACTGGAG GOGCOSCTGG CCACCAGGAC C6CAGCAATT CAGG6CA6TA 1020 

TCXTTGCAAAA Q6GGGGCCCC AGCAGCCCCA TAATAAAGGC TGIGTOGCTC CCACAGCTGG 1080

ATTGGCSkGCC CTTCTGTT06 0C3GTGGCTGC TGGTGTCCTA CIGTSU3CTT CTCCACCTGG 1140 

AAATTTCCCT CTCAC3CTACT TCTCTOGCOC T6GGTACC0C TCTTCTCATC ACTTCCTGTT 1200 

CCCaCCACTC GACTGGGCTG GCCCAGCCCC TGTTTTTCCA ACATTCCCCA GTATCCCCAO 1260 

CTTCTOCTGC GCTGGTTTGC GGCTTTGGGA AATAAAATAC CGTTGTATAT ATTCTGQCAG 1320 

GGGTGTTCTA OCTTTTTGAG GACA6CTCCT GTATCCTTCT CATCCTTOTC TCTCOGCTTQ 1380 

TCCTCTTGTG ATGTTAGGAC AGAGTGAGAG AAGTCAGCTG TCAOGOGOAA GGTQAGAGA6 1440 

AGGATGCTAA 6CTTCCTACT CACTTTCTCC TA8CCAGCCT GGACTTTGGA 60GT666GTG 1500 

GGTGGGACAA TGGCTCCOCA CTCTAAGCAC TGCCTCCCCT ACTOXCGCA TCTTTGGGGA 1560 

ATOGGTTCCC CATATGTCTT CCTTACTAGA CTGTGAGCTC CTOGAGGGCA GGGACCGTGC 1620 

CTTATGTCTO TGTGTGATCA GTTTCrGGCA CATAAATGCC TCAATAAAGA TTTAATTACt 1680 
TTCTATAGTG AAAAAAAA






Seq ID NO: 168 Protein sequence:
Protein Accession #: NP_055215

1 11 21 31 41 51

I I I I I I

MDPARKAGAQ AMIWTAGWLL LLLLRGGAQA LECYSCVQKA DDGCSPNKMK TVKCAPGVDV 60 

CTEAVGAVET IHGQFSLAVX GCGSGIiPGKU DRGLDLHGLli AFIQLQQCAQ DRCNAKUJLT 120 

SRAU>PACaiB SAYPPMGVEC YSCVGItSREA CQ6TSPPWS CYHASDHVYK GCFDQIVTLT 180

AAmrrvSLPV RGCVQDEPCT RDGVTGPGFT LSGSCCQGSH Q^SOLRNKIY FSPRIPPLVR 240 

LPPPEPTTVA STTSVrrSTS APVRPTSTTK PMPAPTSQTP RQGVEHEA8R OEEFRLTGGA 300 
AGHQDR5NSG QYFAKGGPQQ PHKIQSCVAPT AGIiAALLLAV AAGVLL 

Seq ID NO: 169 DNA sequence
Nucleic Acid Accession #: NM_006875
Coding sequence: 186-1190

1 11 21 31 41 51

I I I I I I

GAATTCGGCA CGAGCGCGCX3 GOGAATCTCA ACGCTGOGCC GTXTGOMGC GCTTCCGGGC 60 

CACCAGTTTC TCTGCTTTCC ACCCTGGOGC CCCCCAGCCC TGGCTCCCCA GCTGCGCTGC 120 

CCCGGGCXSTC CAOGCX^CIGC G66CTTAG0G GGTTCAGTGG GCTCAATCTG OSCtiGCGCCA 180

CCTCCATGTT GACCAAGCXTT CTACAG06GC CTCCCGCGCC CXXX3GGGACC CCCAC6CCGC 240 

CGCCAGGAGG CAftGGATOGG GAAGOGTTOG AOGCCXSAGTA TCSSACTOGGC CCCCTCCTGG 300

GTAAGGGGGG CTTTGGCACC GTCTTCX3CAG GACACCGCCT CAOW3ATCGA CTCCAGGTGG 360 

CCATCAAAOT GATTCCCOGG AATOGTGTGC TGGGCTGGTC CCCCTTGTCA GACTCAGTCA 420 

CATGCCCACT OGAAGTOGCA CTGCTATGGA AAGTGGGTGC AGGTGGTGGG CACCCT6G0G 480 

TGATCOeCCT GCTTGACrGG TTTGAGACAC AGGARGGCTT CATGCTGQTC CTOGAfiOGGC 540 

CTTTGCCC3GC CCAGGATCTC TTTGACTATA TCACAGAGAA GGGCCCACTG GGTGAAGGCC 600

CAAGCOGCTG CTTCTTTGGC CAAGTAGTGG CAGCCATCCA GCACTGCCAT TCCOGTGGAG 660 

TTGTCCAT03 TGACATCAAG GATGAGAACA TCCTGATAGA CCTA0GC06T GGCTGTGCCA 720 

AACTCATTOA ■m i t SC ril' Cr 6GTGCCCIGC TTCATGATGA AGCCTACACT GACTTTCATO 780 

6GACAAGGGT GTACAGCXXC CCAGAGTGGA TCTCTCGACA CS»GTACCAT GCACTCOOOG 840 

CCACTGTCIG GTCACTGGGC ATCCTCCTCT ATGACATG6T GTGTGGGGAC ATTCCCTTTG 900

AGAGGGACX:a GGAGATTCTG GAAGCTGAGC TCCACTTCCC AGCCCATGTC TCCCCAGACT 960 

GCTGTGCXXrr AATCCGCCGG TGCX:TGGCCC CCAAACCTTC TTCCCGACCC TCACTGGAAO 1020 

AQATGCIGCT GGACCCCTOG ATGCAAACAC C3VCGOSA6GA TGTTACCCCT CAACCCCTCC 1080 

AAAGGAGGCC CTGCCCCTTT GGCCTGGTOC TTQCTACCCT AAGCCTQOCC TGQCCTOGCC 1140 

TGGCCCCCAA TGGTCAGAAG AGCCATCCCa TGGCCATGTC ACAGGGATAG ATGGACATTT 1200

GTTGACTTGG TTTTACAGGT CATTACCAGT CATTAAAGTC CAGTATTACT AAGGTAAGGG 1260 

ATTGAQGATC AGG6GTTAGA AGACATAAAC CAAGTTTGCC CAGTTCCCTT CXXAATCCTA 1320 

CAAAGGAGOC TTCCTCCCAG AACCTOTOGT CCCTGATTTT GGAGGGGGAA CTTCTTOCTT 1380 

CTCATTTTGC TAAGGAAGTT TATTTTGGTG AAGTTGTTCC CATTTTGAOC CCCGGOACTC 1440 

TTATTTTGAT GATGTGTCAC CCCACATTGG CACCTCCTAC TACCACCACA CAAACTTAGT 1500

TCATATGCTT TTACTTGGGC AAGGGTGCTT TCCTTCCAAT ACCCCAGTAG CTTTTATTTT 1560 

AGTAAAGGGA CCCTTTCCCC TAGCCTAGGG TCCCATATTG GGTCAAGCTG CTTACCTGCC 1620 

TCAOC3CCA08 ATTTTTTATT TIQQQG6AGG TAATGCOCTG TTGTTACOCC AAGGCTTCrT 1680 

rm ' m TTT TTTTTTTTTO GGTGAOGGGA CXJCTACTTTG TTATCCCAAO TOCTCTTATT 1740 

CTGGTGAGAA GAACCTTAAT TCCATAATTT GGGAAGGAAT GGAAGATGGA CACCACCGGA 1800

CACCACCAGA CAATAGGATG GGATGGATGG TTTTTTGGGG GATGGGCTAG GGGAAATAAG 1860 

GCTTGCTGTT TGTTTTCCTQ GOGCGCTCCC TCCAATTTTG CftGATTTTTG CAACCTCCTC 1920 

CTGAOCOGGG ATTGTGCAAT TACTAAAATG TAAATAATCA 06TATTGT66 GGA66GGAGT 1980 

TCCRAGTGTO CCC TOCTTTT TTTTCCTGOC TQGATTATTT AAAAAGCCAT GTOTOQAAAC 2040 

CCACTATTTA ATAAAACTAA TAGAATCASA AAAAAAAAAA AAAAAAAA






Seq ID NO: 170 Protein sequence: 
Protein Accession #: NP_006866

1 11 21 31 41 51

I I I I I I

MLTKPLQGPP APFGTPTPPP OGKDREAFEA EYRLGPLLGK G6FGTVFAGB RLTDRLQVAI 60 

KVIPBNSVIjG WSPIiSDSVTC PLEVALIiWKV GAGGGHPGVX RLLDWPETQB GFMLVLERPL 120 

PAQDLFDYIT EKOPLGBGPS RCFFGQWAA IQHCHSRGW HHDIKDENIL IDLRRGCAKL 180

IDPGSGALLH DBPYTDFDGT RVYSPPEWIS RHQYHALPAT VWSLGILLYD MV06DIPPER 240 

DQBILEAELH FPAHVSPDCC ALIRRCIiAPR PSSRPSLEBZ IiUJPNMQTPA EDVTPQPLQR 300 
RPCPF6LVLA TIiSIAfiTPGLA PNGQKSBPMA MSQO 

Seq ID NO: 171 DNA sequence
Nucleic Acid Accession #: NM_003646
Coding sequence: 89..2875

1 11 21 31 41 51

I I I I I I

G0GG06CGGA GCGGGCGTGC TGAGCCCOGG COGCCGGCCC GGCATGGGCG TCTCCCGCGG 60 

GCCCTCCGCC GGCCG6GGCT AGGGCCG6AT GQA0CC60G6 GACGGTAGCC CCGA66CCC6 120 

GAGGA6CGAC TCCGAGTOGG C T fCUJCCTC GTCCAGOOGC TC06AS060S ACSC06GTCC 180 

CGAGCCG6AC AAC56CGCCGC GGCGACTCAA CAAGCGGOOC TTCCCGGGGC TGOGGCTCTT 240 

CQGGCACAG6 AAAGOCATCA CCAAOTOSGG OCTCCAGCAC CTGGCCCCCC CTCCGCCCAC 300

CCCTGGGGCC COGTGCAGCG AGTCAGAGCG GCAGATCCGG AGTACAGTGG ACTGOAGOGA 360 

GTCAGC6ACA TATGGGGAGC ACATCTGGTT OGAGACCAAC GTGT CCGGQ G ACTTCTQCTA 420 

CGTTGGGGAG GAGTACTGTG TA6CCAGGAT GCTSAAOTCA GTGTCTOGAA GAAAiSTQOOC 480 

AGCCTGCAAG ATTGTGGTGC ACAOGOCCTO CATOSAGCAO CTGGAflAAGA T AAATTT COS 540 

CTGTAAGCOG TCCTTCOSTG AATCAQGCTC CAGGAATGTC OGOGAGCCAA CCTTTGTA06 600

GCACCACTGG GTACACAGAC GACGCCAGGA OGGCAAGTGT CGGCACTGTG GGAAGGGATT 660 

0CAGCAGAA6 TTCACCTTCC ACA6CAAG6A GATTGTQ6CC ATCAGCTGCT 0GT0QT6CAA 720 
GCAG6CATAC CACAGCAAGG TGTCCT6CTT CAT6CTGCA6 CAGATOGAGS AGCOGTGCTC 780

GCTGGGGGTC CA08CA6CCG TGGTCATOOC GCCCACCIGG A7CCTC0Q06 CCOS G AGGCC 840

CCAGAATACT CTGAAAGCAA GCAAGAAGAA GAAGAGGGCA TCCTTCAA6A GGAAGTCX?^ 900 

CAAGAAAGGG CCTGAGGAGG GCC3GCTGGAG ACCCTTCATC ATCAGGCCCA CCCCCTCCCC 960

GCTCATGAAG CCCCTGCTGG TGTTTGTGAA CCCCAAGAGT OGGGGCAACC AGGGTGCAAA 1020 

GATCATCCAG TCTTTCCTCr GGTATCTCAA TCOCOGACAA OTCTTOGAOC TOAGOCAGGG 1080 

AGGGCCCAA6 GAGGOGCTGG AGATGTA006 CAAAOTGCAC AAGCTG06GA 7CCTGG0GTG 1140 

OGGGGGOGAC GGCA06GTGG GCT6GATCCT CTCCACCCT6 GACCAGCTAC GCCTGAAGCC 1200 

GCCACCCCCT GTTGCCATCC TGCCCCTGGG TACTGGCAAC GACTTGGCCC GAAOCCTCAA 1260 

CTGGGGTGGG GGCTACACA6 ATGAGCCTGT GTCCAAGATC CTCTCCCAOS TGGAGGAGGG 1320 

GAAOGIGGTA CSUSCIGGAOC GCTGGCS^OCT 0CA06CTGAG OCCAAOCCOO AGGCAOGGCC 1380 

T6AGGAC0GA GATGAAGGC6 CCACOGAOCG GTTGCCCCTG GATGTCTTCA ACAACTACTT 1440 

CAGCCTGGGC TTTGAOGCCC AOGTCACCCT GGAGTTCCAC GAGTCTCGAG AGGCCAACCC 1500 

AGAGAAATTC AAOWCQGCT TTOGGAATAA GATGTTCTAC GCOGGGACAG CTETCTCTGA 1560 

CTTCCTGATG GGCAGCTCCA AGGACCTGGC CAAGCACATC CGAGT6GTGT GTGATGGAAT 1620 

GGACTTGACT OOCAAGATOC AGGACCTGAA ACCCCAGTgT GTTGTTTTCC TGAACATCCC 1680 

CAG6TACTGT GGGGGCACCA TGCCCTGGGG CCACCCTGGG GAGCACCAOG ACTTT6AGCC 1740 

CCAGOGGCAT GAOGACGGCT ACCTCGAGGT CATT66CTTC ACCATGAC3GT CGTTGGCCGC 1800 

GCT6CAGGTG GGCXX5ACACG GCGAGCGGCT GACQCAGTGT OGCGAGGTGG TGCTCACCAC 1860

ATCCAAGGCC ATCCOGGTGC AOGTGGATGG CGAGCCCTGC AAGCTTGCAG CCTCAOGCAT 1920 

G0GCATC6CC CTGOGCAACC AGGCCACCAT GGTGCAGAA6 GOCAAGGGGC G6AG06CXXSC 1980 

CCCCCTGCAC AGCGACCAGC AGCCGGTGCC AGAGCAGTTG CGCATCCAGG TGAGT060GT 2040 

CAGCATGCAC GACTATGAGG CCCTGCACTA OGACAAGGAO CAGCTCAAGG AGGOCTCTGT 2100 

GCOGCTGGGC ACTGTGGTGG TCCCAGGAGA CAQT6ACCTA GAGCTCTGCC GTGCXXavCAT 2160 

TGAGAGACTC CAGCAGGAGC CCGATGGTGC TGGAGCCAAG TCCCCGACAT GCCAGAAACT 2220 

GTCCCCC3VAG TGGTGCTTCC TGGACGCCAC CACTGCCAGC CGCTTCTACA GGATCGACCG 2280 

AGCCCAGGAG CACCTCAACT ATGTGACTGA GATCGCACAG GATGAGATTT ATATCCTGGA 2340 

CCCTGAGCTG CTGGGGGCAT OGGCCOQGOC TGACCTOCCA hCCCCO^CTC CCOCTCTCCC 2400 

CACCTCACXX; TGCTCACCCA OGCCCOGGTC ACXGCAAGGG GATGCTGCAC CCCCTCAAGG 2460 

TGAAGAGCTG ATTGA6GCTG CCAAGAGGAA CGACTTCTGT AAGCTCCAGG AGCTGCACCG 2520 

AGCTGGGGGC GACCTCATGC ACCGAGAOSA GCAGAGTOGC AC3GCTCCTGC ACCACGCAGT 2580 

CAGCACTGGC AGCAAGGATG TGGTC06CTA CCTGCTGGAC CACGCCCCCC CAGAGATCCT 2640 

T6ATG0GGTG GAGGAAAACG GGGAGAOCTG TTTGCAOCAA GCAGOGGCCC TGGGOCAGOG 2700 

CACCATCT6C CACTACATOS TGGAGGOOOG G6CCT0CSCTC ATGAAGACAG ACCAGCAGGG 2760 

CGACACTOCC GGGCA606GG CTGAGAAGGC TCAGGACACC GAGCIGGCOG CCTACCTGGA 2820 

GAACGGGCAO CACTACCAGA TGATCCAGOS GGA6GACCAG GAGACGGCTG TGTAGCGGGC 2880 

Seq ID NO: 172 Protein sequence: 
Protein Accession #: NP_003637

1 11 21 31 41 51

I I I I I I

MEPfiXXSSPBA RSSDSESASA SSSGSERDAG PEPDKAPRRIi NKRRFPGLRb FGHRKAZTKS 60 

GLQHLAPPPP TPGAPCSBSB RQZRSTVDHS BSATYGEHIH FETNVSGDFC YVGBQYCVAR 120 

MLKSVSRRKC AACKIWHTP CIEQLfiiCINF RCKPSPRBSG SRNVRBPTFV HHHHVHRKRQ 180 

DGKCRHCGKG PQQKPTFHSK EIVAISCSWC KQAYHSKVSC FMLCXJIEEPC SLGVHAAWI 240 

PPTWILRARR PQimiKASKK KKHASFKRKS SXKGPEBGRVI RPFIIRPTPS PLKKPLLVFV 300 

NFKSGGKQGA KZZQSPLmrL NPROyFDLSQ GGPKBALEtff RJCVHNLRIIA 0G6DGTVGWI 360 

I.STLDQLRLR PPPPVAILPL GTGNDLARTL MWGGGYTDEP VSKILSHVEE GNVVQUDRHD 420 

LHAEFNPEAG PEDRDEGATD RLPLDVTNNY FSLGFDAHVT LEFHESREAN PHKFN5RFRN 480 

KMFYAGTAFS DFLKGSSKDL AKHIRWCDG MDLTPKIQDL KFQCWFLNI PRYCAGTMPW 540 

GBPGEHHDPE PQRiQ3DGYLB VIGFlTfTSLA ALQVGGKGER LTQCREWLT TSKAIPVQVD 600 

GSPCKLAASR ZRZAIiRMQAT MVQXAKRRSA APLBSDQQPV PBQLRIQVSR VSMHDYEALH 660 

YDKEQLRBAS VPLGTVWPG DSDLBLCRAH XERLQQEPDG AGAKSPTGQK LSPKNCFU3A 720 

TTASRPyRID RAQEHLNYVT EIAQDEIYIL DPELXiGASAR PDLPTPTSPL PTSPCSPTPR 780 

SLQGDAAPPQ GEEZjIEAAKR NDFCKLQBLB RAGGOLMHRD EQSRTLIiKHA VSTGSKDWR 840 

YLliDHAPFBI LDAVEENGBT CLHQAAALGQ RTICHYZVEA GASLMKTDQQ GDTPRQRAER 900 
AQDTELAAYL BNRQHYQMZQ REDQETAV 

Seq ID NO: 173 DNA sequence
Nucleic Acid Accession #: AF232772
Coding sequence: 1-1662

1 11 21 31 41 51

I I I I I I

ATGGOGGTGC AGCTGACGAC AGCCCTGOGT GTGGTGC^SCA CCAGCCTGTT TGCOCTGGCA 60 

GTGCTGGGTG GCATCCTGGC AGCCTATGTG A0GG6CTACC AGTTCA7CCA CAO GG AAAAG 120 

CACTAC CTOT C CTTOO QOCT GTAOGGOGOC ATCCT6QG0C TGCACCTGCT CATTCAGAGC 180

CTTTTT6CCT TCCTGGAGCA C0660GCATG 06A0GTGC06 6CCAGGCCCT GAAGCT6CCC 240 

TCCC0GC3GGC GGGGCTOGGT GGCACTGTGC ATTGCOSCAT ACCAGGAGGA CCCTGACTAC 300 

TTGOGCAAGT GOCTGCGCTC GGCCCAGOSC ATCTCCTTCC CTGACCTCAA GGTGGTCATG 360 

GTGGTGGATG GCAACXXSCCA GGAG6A0GCC TACATQCTGG ACATCTTCCA CGAGGTGCT6 420 

GGOOOCACOQ AGCAGOCOGQ Cl ' lCrr T OTO TGG06CA6CA ACTTGCATGA G6CAGGGGAG 480

GGTGAGAOGG AQOCCAGCCT 6CAGGAGG6C AT66AC06TG TGG6GGATGT GGTG0GG6CC 540 

AGCACXrrrCT CXSTGCATCAT GCAGAAGTGG GGAGGCAAGC GCGAGGTCAT GTACACGGCC 600 

TTCAAGGCCC TOGGOGATTC GGTGGACTAC ATCCAGGTGT GOGACTCTGA CACTGTGCTG 660 

GATCCAGCCT GCACCATCGA GATGCTTOGA GT0CIG6AGG AGGATCCXXIA A6TAGGQ6GA 720 

GTC3QGG6GAG ATGTCCAQAT 0CTCAACAA6 TAC6ACTCAT GGATTTCCTT CCTGAGGAGG 780 

GTGOGGTACT QGATG6CCTT CAAC6TGGAG 06GGCCTGCC AGTCCTACTT TGGCT6TGTG 840 

CAGTGTATTA GTGGQCCCTT GGGCATGTAC CGCAACAGCC TCCTCCAGCA GTTCCTGGAG 900 

GACTGGTACC ATCAGAAGTT CCTAGGCAGC AAGTGCA6CT TOGGGGATGA COOGCACCTC 960 

ACCAACOGAG TCCTGAGCCT TGGCTACOGA ACTAAGTATA CCGCGOGCTC CAAGT6CCTC 1020 

ACAGAGACCC CCACTAAGTA CLTCC GG TGG CTCAAOCAGC AAACCC6CTG GAGCAAGTCT 1080

TACTTCOGGG AGTGGCTCTA CAACTCTCTG TGGTTCCATA AGCACCACCT CTGGAT6ACC 1140 

TAOGAGTCAQ TGGTCACGGG TTTCTTCCCC TTCTTCCTCA TTGCCACGGT TATACAGCTT 1200 

TTCTAOOGGG GOOGCATCTG GAACATTCTC CTCTTCCTGC TGAOGGTGCA GCTGGTGGGC 1260 

ATTATCAAGG CCACCTACGC CTGCTTCCTT OGGQGCAATG CAGAGATGAT CTTCATGTCC 1320 
CTCTACTOOC TCCTCTATWT GTCOkOOCTT CTGOOGGCOV AG21TCTTTGC CATTGCIACC 1380

ATCAACAAAT CTGGCTGGGG CACCTCTQGC OGAAAAACCA TTaTGGTGAA CTTCATTGGC 1440 

CTCATTCCTG TGTCCATCTG GGTGGCAOTT CTCCTGGAGG GGCTGGCCTA CACAGCTTAT 1500 

TGCCAGGAOC TGTTCAGTGA GACAGAGCTA OCCTTCCTTG .TCTCTGGGGC TATACTGTAT 1560 

GGCTGCTACT GGGTGGOCCT CCTCATGCTA TATCTGGCCA TCATCGCCOG 6CGATGTGGG 1620

AAGAA6C0G6 AGCAGTACA6 CTTGGCTTTT GCTGAGGTOT GACAT6GCCC CCAAGCAGAG 1680 

CGGGTAAAGT GCAATGG6TA AGGGA6G6AA GGGGAATGGA A6AGAAAAGA GAOOCTGGGA 1740 

tSGGAGGAGGG ACTGCTGTGT TTTRGTCTCT TAATGGTCCA AAOGACAAAT CTAAAATGCA 1800 

AAGAAOGGTG ATGTAGTATG GCCTGACAGC TCTGTTTAGA GGAGGCAACA CTGATCCCCC 1860 

A6ATGCAGGG CT6CA6GGGA TTCTGTGTTT TCA6ACTGCC TGTCTGCTT6 CATCTGCACA 1920 

TAGGCAGTAO CCTCCTCCIG 6GCTCCAGA6 GGCACTCAGA AGITGTGCTA AACCAAGTTA 1980 

A6TCCCATTC AGTGGCAACT TGXGATAOGT ACCTGASTGA 06GCAACCT6 0G6AAGGAGG 2040 

TTCTCCCAGC CCATCTGAAC ACAACCAGAG GTGGCAGGAG AATTTCTACT GAG06AGGTG 2100 

GGCCGGTTAG TGTATGTCAC CXXXZACCCCA CCCATAAGTA GTCATCAATC CAATAA6ATT 2160 

GCGOGTGAGA TACAA6GGCC AGAA6CCTGA TCTTTGG6CA TCAGAAAACA GGGTCCAGGA 2220 

ATGGTGCTTT ATGTGAGATA CXXXACTCCA CATCAACATT CCAiGGGAIGA GCCAAACCAO 2280 

CAGGGACTTA GCACTGAACT GCTTTTAAAA GTGCACATTA AAAAGGAAAG TTT6CCAGGA 2340 

GGAACAAAGA GATTGTGGTG GTGCTAAAGG AGGCCATAAQ CTACACAGAO GCCTTGGGTG 2400 

TTCCACCTGQ AAACTGCTCA GAOGTCTAGA TGGGTTCTTA GCTTGTCTGT GATCTCTGCT 2460 

GGGGAGATAA AAAGATTAAG CXXXAACATG TTCAGAAAAO AAGTGAAGTC TTGGGTATTT 2520 

TAACCTGTAT ACTCTTCAAT TCCTCTCAAA TTCAGCTCTG ATCTGAGGCT AA GACAC ACT 2580

OXCACTTCA CTTTCTTCAA AGCCACATTT TTTGAGGTAT CACTGCAGTC ACCTCTTCTA 2640 

CCCTCATCAT CATAflGTAAG GTTTTCAAGG TOGCAATTGG GGCGGAGCCC OGGCTTCTTA 2700 

TAOAAGCTTC AOCAGGAGGC AAGOGTGTTC TCAGCACATA TGGGAACTAT GAGGAGCCTC 2760 

TGATCAAATT GGCTACAATC TTGGAGCTGC TTGGACX5GAT TCCTTGGCAG CCGGGTTAGC 2820 

ATGTGTGACT TTCAGGCTAC TGTTCTTGAC AATCATCTCC AATGGAAAGC TTTTCA6TGT 2880 

TCCCAAAGTG AACTCTCAAA TCCAAAATGG TTATCTTTGA GACCATCCAT TCTCCTCAGT 2940 

QGCTTCTCCA GG6AATTCTT ACAGCCAAGT TGTGACAGTC ACTGCATTTG CCT6CTTCTT 3000 

TCCAGAAACC AAACTAGGAO ATGAAACTGG TTCCTACATC CTAAGGTTCT T6CTTTCTCT 3060 

CTCATGCCTC CTGAGGCTGT TTTTGGCTGT TTTCCCTCTG CTGCTTTTGG GGAATGAGGG 3120 

GAAGCCATTT TCCAAGTGAC TTGCAATCCA GQCT6TTCTC AGCGTTTT6A GTTTAAAACC 3180 

TGGGATCCTG ACTAAGCCTT TGACTTAAGG GTTGCTTGCT TGCCCTCCAA ATGTCSrTTTC 3240 

TCAAAGG66C CAACTAAOCC OTGCAGAAOC AGCACTAAGG TG6ACA6CAG ACAAGAGGGC 3300 

AAGCCTCTAA TGTAOCAAST GCTTCCTACA AAGACGCAAG GTGTGCTCOG AACCACAGAT 3360 

GGGCAAACCC TGGTGCTTTC CTTCATCTCC CAOGAACTCA AGGGTTTTCC AAGTGTAGCT 3420 

AACAGTTGCC ACATCACACA GACCTCCAGT TTCTGOTAAG ACTGCTGGTT GACATCAGAC 3480 

CCAACCCATT GAAGGCTGGA AGGCAGCAGG CATTTGCTAA GGCAGCTGAT CCAGGCAATC 3540 

GTTCTGCTGG CCAAGAAGTT AAACTATTTT GAGCATT3MSA ATGGAGGAAA TCCX3GTCAGC 3600 

CAACTGCAGA GTTCAGACTT OGCTAAGGGC TTCTTTTTCT TCAOCATTTA CT TGAAG ATT 3660 

AATGTAGGAT GACAGGCTCT CCTG6CT6TC CTACCATCAG CTCT6CCTT0 CA CTGT GGTC 3720 

GTCAACTTTC CTCAAATCAA AAACAGGCAG GTACAGGTAG TGGGCTCACA ACGTTTGACC 3780

TOGACTGGTT TTTCTAAGTT ATTTTGTACA TTTTTCAGCA GCAAAACCAA ACTGGGTCTT 3840 

CAGCTTTATC CCOGTTTCTT GCAA666AA0 A6CCTTTATA CAATTGGACG CATTTTGOTT 3900 

TTTCCTCATT GAQAATTCAA ATCCTCTTTT GTATTGTTTC TACAATAATT TGTAAACATA 3960 

TTTATTTTTA OUTUCTrfn ' TTTTTTTTTI TAATTTTCAG GTCAAGTTTT TTATACTGCA 4020 
CTTATTTGTC AAAATAAAGA TTCTCACAT 

Seq ID NO: 174 Protein sequence:
Protein Accession #: AAF36984

1 11 21 31 41 51 

I I I I I I 

MPVQLTTALR WGTSLPAIiA VIiGOZLAAYV TGYQFIETER HyLSFGLYGA ZLGLHLLIQS 60 

LFAFLEBHRM RRAGQALKLP SPSRGSVAIiC lAAYQEDroY LRKCLRSAQR ISFPDLKWM 120 

WDGNRQBDA YMUDIPHEVL GGTEQAGPPV WRSNPHBAGE GETEASLQBG MDRVRDWRA 180 

STFSCIKQKW GGKREVMYTA FKAL6DSVDV lOVCDSDTVL DPACTIEMliR VLEEDPQVGG 240 

VGGDVQILNK YDSHISFLSS VRYWMAFNVE RACQSYFGCV QCISGPI iGMy RNSIiWQFLB 300 

DWYHQECFLGS KCSFG DDRHI . TKRVLSLGYR TKYTARSKCL TETPTKnAH UIQQTRWSKS 360 

yFREWLYNSti WFHKHRZiHm YSSWTGFFP FFLZATVIQIi FYRGRZHNZL LFLLTVQLVG 420 

IIKATYACPL RGNAEMZBMS IiYSUiYMSSIi LPAKZFAIAT ZNKSGW6TSG RKTIWHFIO 480 

LIPVSIWVAV UiBGXAYTAY CQDI«FSBTEL APLVSGAZLY GCyWVAIiLML YLAZZARRC30 540 
KKPBQYSLAF ASV 

Seq ID NO: 175 DNA sequence
Nucleic Acid Accession #: NM_000691
Coding sequence: 43..1404

1 11 21 31 41 51

I I I I I I

CCAGGAGCCC CAGTTACC6Q GAGAGGCTGT GTCAAAGGCG CCATGAGCAA GATCAGOGAG 60 

GCCGTGAAGC GOSCCCXSCGC CX3CCTTCAGC TCGGGCAGGA CCC3GTCCGCT GCAGTTCCGA 120 

TTCCAGCAGC TGGAG6CGCT GCAGGGCCIG ATCCAGOAGC AGOAGCAGGA GCTGGTG06C 180 

GCGCTGGCCG CA6ACCTGCA CAAGAATGAA TGGAAOOCCT ACTAT6AG6A GGTGGTGTAC 240 

GTCCTA6AGG AGATCGAGTA CATGATCCAG AAGCTCCCTG AGTGGGCOGC GGATGAGCCC 300 

GT6GAGAAGA OGCCCCAGAC TCAGCAGGAC GAGCTCTACA TCCACTCGGA GCCACTGGGC 360 

GTGGTCCTCG TCATTGGCAC CTGGAACTAC CCCTTCAACC TCACCATOOl GCCCATGGT6 420 

GGOGCCATCG CTGCAGG6AA 06CAGTGGTC CTCAAGCCCT C3GGA6CTGAG TGAGAACAT6 480 

GCGAGCXTTGC TGGCTACCAT CATCCCCCAQ TACCTGGACA AG6ATCTGTA CCC AGTA ATC 540 

AATGGQGGTG TCCCTGAGAC CAOGGAGCTG CTCAAGGAGA G6TTC36ACCA TATOCTGTAC 600 

A06GGCAGCA OGGGGGTGGO QAAGATCATC ATGACGGCTG CTGCCAAGCA CCT GACCC CT 660 

GTCAOQCTGO AOCTGGGAGG GAAGAGTCCC TGCTACXTTGO ACAAGAACTG TGACCTOGAC 720 

GTOGCCTGCC GACGCATCGC CTGGGGGAAA TTCATGAACA GTGQOCAQAC CTGOGTGGOC 780 

OC3W5ACTACA TCCTCTGTGA CCCXTTOGATC CAGAACCAAA TTGT0GA6AA GCTCAAGAA6 840 

TCACT6AAA6 AGTTCTA0G6 GGAAGATGCT AAGAAATCCC 6GGACTATGG AA GAAT CATT 900 

AOTGCCO GG C ACrTOCASAG GGTGAT0G8C CTGATTGAGG GCCAGAAGGT GGCTTATGGG 960 

GGCACCX3G66 ATGCGGOCAC T06CTACAIA GCCCCCACCA TCCTCACGGA OGT6GACXX3C 1020 
CAGTCCOOGG T6IVT6CAAGA OGAGJlTCrrC GGGCCTGTOC TGCCOITOST OTGOGTGCOC 1080

AGCCTG6A6G AGOCOlTCai GT7CATCAAC CAGC3GTGAGA AGCCCCTGGC CCTCIACATG 1140 

TTCTCOWSCA AOGACAAGGT GATTAAGAAG ATGATTGCAG AGACAT0C3iG TGGTGGGGTO 1200 

GCGGCCAAOG ATGTCATOGT CCACATCACC TTGCACTCTC TGCCCTTCGG GGGCGTGGGG 1260 

AACAGCGGCa TGGGATCCTA CCATGGCAAG AAGAGCTTOG AfiACTTTCTC TCACCGC06C 1320 

■i CT rG C CTG G TGAGGCCrCT GATGAATOAT GAAG6GCTGA AGGTCAGATA CCCCCO^GC 1380 

CCGG<XAA6A TGACCCAGCA CTGAGGAGGG GTTGCTCOSC CTGGOCTGGC CATACTGTGT 1440 

CCCATCGGAG T6CGGACCAC CCTCACTGGC TCTCCTGGCC CTGGAGAATC GCTCCTGCaO 1500 

CCCCAGCCCA GCCCCACTCC TCTGCTGACC TGCTGACCTG TGCACACCCC ACTCCCACAT 1560 

GGGCCCAGGC CTCACCATTC CAAGTCTCCA CCCCTTTCTA GACCAATAAA GAGACAAATA 1620 
CAATTTTCTA ACT0G6 



Seq ID NO: 176 Protein sequence:
Protein Accession #: NP_000682

1          11         21         31         41         51

|          |          |          |          |          |

MSKISEAVKR ARAAPSSGRT RPLQPRFQQL EALQRLIQEQ EQBLVGALAA DLHKNEMNAY 60 

YEBWYVLSE ISWIQIOiPE WAADEPVBKT PQTQQDELYI HSEPLGVVX.V I GTWNY PFMI; 120 

TIQPMVGAIA AGNAWLKPS ELSEMMASU* ATIIPQYLDK DLYPVINGGV PETTBLLKER 180 

FDHILYTGST GVGKIIMTAA AKHLTPVTLB LGGKSPCYVD KNCDLDVACR RIAW6KFMIS 240 

GQTCVAPDYI LCDPSIQIIQI VEKLKKSLKB FYGEDAKKSR DYGRIISAHH PORVMSLIBG 300 

QKVAYGGTGD AATRYZAPTI LTDVDPQSPV MQEEIFGFVL PIVCVRSLEB AIQFINQREK 360 

PLAIiYMFSSH DKVIKKMIAB TSSGGVAA23D VIVHITLBSL PFGGVGNS6H GSYHGKKSFE 420 
TPSHSRSCbV RPIUNDEGIiK VRYPPSPAKM TQH 



Seq ID NO: 177 DNA sequence

Nucleic Acid Accession #: NM_001067.1

Coding sequence: 108-4703

1          11         21         31         41         51

|          |          |          |          |          |

CTAACC6A0G OGCGTCTGTG GAGAAGGGGC TTOGTOQGGG GIOGTCT06T GGGGTCCTGC 60 

CTGTTTAGTC GCTTTCAOGG TTCTTQAOCC CCTTCAOSAC OGTCACCATG GAAGTGTCAC 120 

CATTGCAGCC TGTAAATQAA AATATGCAAG TC3WICAAAAT AAAGAAAAAT GAAGATGCTA 180 

AGAAAAGACT GTCTGTTGAA AGAATCTATC AAAAGAAAAC ACAATTGGAA CATATTTTGC 240 

TC0GCCCA6A CACCTACATT GGTTCTGTGG AATTAGTGAC CCAGCAAATG TGGGTTTA06 300 

ATGAAGAT6T TGGCATTAAC TATAGGGAAG TCACTTTTGT TCCrOGTTTG TACAAAATCT 360 

TTGATGAGAT TCTAGTTAAT GCTGCGGACA ACAAACAAAG GGACCCAAAA ATGTCTTGTA 420 

TTAGAGTCAC AATTGATCOS GAAAAC3ATT TAATTAGTAT ATGGAATAAT GGAAAAGGTA 480 

TTCCTGTTGT TGAACACAAA GTTGAAAAGA TGTATGTCCC AGCTCTC3VTA TTTGGACAGC 540 

TOCTAACTTC TAGTAACTAT GATGATGATG AAAAGAAAGT GACA6GTGGT CGAAATGGCT 600 

AK36AGCCAA ATTGT6TAAC ATATTCAGTA CCAAATTTAC TGTGGAAACA GCCAGTAGA6 660 

AATACAAGAA AATGTTCAAA CAGACATGGA TGGATAAIAT GGGAAGAGCT GGTGAGAT66 720 

AACTCAAGCC CTTCAATQGA GAAGATTATA CATGTATCAC CTTTCAGCCT GATTTGTCTA 780 

AGTTTAAAAT GCAAAGCCTG GACAAAGATA TTGTTGCACT AATGGTCAGA AGAGCATATO 840 

ATATT6CTGG ATCCACCAAA GATGTCAAAG TCTTTCTTAA TGGAAATAAA CTGCCAGTAA 900 

AAGGATTTOG TAGTTATGIG GACATGTATT TGAAGGACAA 6TTGGATGAA ACTGGTAACT 960 

CCTT G AAAGT AATACAT6AA CAA6TAAACC ACAGGT0G6A AGTGTGTTTA ACTAT6AGTG 1020 

AAAAAGGCTT TCAGCAAATT AGCTTTGTCA AC3VGCATTGC TACATCCAAG GGTGGCAGAC 1080 

ATGTTGATTA TGTAGCTGAT CAGATTGTGA CTAAACTTGT TGATGTTGTG AAGAAOAAGA 1140 

ACAAGGGTGG TGTTGCAGTA AAAGCACATC AGGTGAAAAA TCACATGTGG ATTTTTGTAA 1200 

AT6CCTTAAT TGAAAACCCA ACXnTTGACT CTCAGACAAA AGAAAACATG ACTTTACAAC 1260 

CCAAGAGCTT TGGATCAACA TGOCAATTGA 6T6AAAAATT TATCAAAGCT GCCATTGGCT 1320 

GTGGTATTGT AQAAAGCATA CTAAACTGGG T6AAGTTTAA GGCCCAAGTC CAGTTAAACa 1380 

AGAAGTGTTC AGCTGTAAAA CATAATAGAA TC3iAGGGAAT TCCCRAACTC GATGATGCX» 1440 

ATGATGCAGG GGGOOGAAAC TCCACT6AGT GTACGCTTAT CCTGACTGAG GGA GATTC AG 1500 

CCAAAACTTT GGCTGTTTCA GGOCTTGGTG TGGTT0G6AG AGACAAATAT GGOOTTTTCC 1560 

CTCTTAGAGG AAAAATACTC AATGTT0GA6 AAGCTTCTCA TAAGCAGATC ATG6AAAATG 1620 

CTGAGATTAA CAATATCATC AAGATTGTGQ GTCTTCAOTA CAA6AAAAAC TATGAAGATG 1680 

AAGATTCATT GAAGACX5CTT OSTTATGGGA AGATAATGAT TATGACAGAT CAGG ACCAAG 1740 

ATGGTTCCCA CATCAAAQQC TTGCTGATTA ATTTTATCCA TCACAACT6G CCCTCTCTTC 1800 

TGCGACATCO TTTTCTGGAQ GAATTTATCA CTCCCATTGT AAAGGIATCT AAAAACAAOC 1860 

AAGAAATGGC ATTTTACAGC CTTCCTOAAT TTGAAQAGTG GAAGAOTTCT ACTCCAAATC 1920 

ATAAAAAATG GAAAGTCAAA TATTACAAAG GTTTGOGCAC CAGCACATCA AAGGAAGCTA 1980. 

AAGAATACTT TGCAGATATG AAAAGACATC GTATCCAGTT CAAATATTCT GGTCCTGAAG 2040 

ATGATGCTGC TATCAGCCTG GOCTTTAGCA AAAAACAGAT AGAT GATO GA AAGGAATGGT 2100 

TAACTAATTT CATGGAGGAT AGAAGACAAC GAAAGTTACT TGGGCTTCCT 6AGGATTACT 2160 

TGTATOGACA AACTACCACA TATCTGACAT ATAATGACTT CATCAACAAO GAACTTATCT 2220 

TGTTCTCAAA TTCI6ATAAC GAGAGATCTA TCCCTTCTAT GGTGGAT6GT TTGAAACCAG 2280 

GTCAGAGAAA GGrTTTGTTT ACTTGCTTCA AACX3GAATGA CAAGOGAGAA GTAAAGGTTG 2340 

CCCAATTAGC TGGATCAGTG GCTGAAATGT CTTCTTATCA TCAT6GT6AG ATGTC ACTA A 2400 

TGATGACCAT TATCAATTTG GCTCAGAATT TTGTGGGTA6 CAATAATCTA AACCTCTTGC 2460 

AGCCCATTGG TGAGTTTGGT ACCAGGCTAC ATGGTGGCAA GQATTCTQCT AGTOCAOGAT 2520 

ACATCTTIAC AATGCTCAGC TCTTTGGCTC GATTGTTATT TCCACCAAAA GATQATCACA 2580 

06TTGAA0TT TTTATATGAT GACAACCAGC GTGTTGAGCC TC5AATGGTAC ATTCCTATTA 2640 

TTCCCATGGT GCTGATAAAT GGTGCTGAAG 6AATCGGTAC TGGGTGGTCC TGCAAAATCC 2700 

CCAACTTTGA TGTGOGTGAA ATTGTAAATA ACATCAGG06 TTTGATG6AT GGAGAAGAAC 2760 

CTTTGCCAAT GCTTOCAAGT TACAAGAACT TCAAGGGTAC TATTGAAGAA CTOGCTCCAA 2820 

ATCAATATGT OATTAGTGGT GAAGTAGCTA TTCTTAATTC TACAACCATT GAAAT CTCA G 2880 

AGCTTCCCOT CAGAACATGG ACCCAGACAT ACAAAGAACA A6TTCTAGAA CXrCATGTTGA 2940 

ATGGCACOGA GAAGACACCT CCTCTCATAA CAGACTATAG GGAATACCAT ACAGATACCA 3000 

CTGTGAAATT TGTTGTGAAG AT6ACTGAAG AAAAACTGGC AGAGGCAGAG AGAGTTGGAC 3060 

TACACAAAGT CTTCAAACTC CAAACTAGTC TCACATGCAA CTCTATGGTG CTTTTTGACC 3120 

AOGTAGGCTG TTTAAA6AAA TA3GACA0GG TGTTGGATAT TCTAAGAGAC TTTTTTGAAC 3180 

TCMSACTTAA ATATTATGGA TTAAGAAAAG AATGGCTCCT AGGAATGCTT GGTGCTGAAT 3240 

CTOCIAAACT GAATAATCAO GCTGQCTTTA TCTTAGAGAA AATAGATGGC AAAAXAATCA 3300

TTGAAAATAA GCCTAAGAAA GAAT1AATTA AA6TTCTGAT TOUSAGGGGA TATGATTOGG 3360

ATCCT6TGAA GGCCT G GAAA GAA6CCCA6C AAAAGGTTCC AGATGAAGMl GAAAATGAA6 3420 

AGftGTGACAA CGAAAAGGAA ACTGAAAAGA GTGACTCC7GT AACAGATTCT GGAOCAACCT 3480

TCAACTATCT TCTTGATATG CCCCTTTGGT ATTTAACCAA GGAAAAGAAA GATS^CTCT 3540

GCAGGCTAAG AAATGAAAAA GAACAAGA6C TGGACACATT AAAAAGAAA6 AGTCCATCAG 3600 

ATTTGTGGAA AGAAGACTTG GCIAGATTIA TTGAAGAATT GGAGGCTOTT (SAASOCAAGG 3660 

AAAAACAAGA T6AACAAGTC GGACTTCCTG 6GAAAOGGG6 GAAGGCCAAG GG6AAAAAAA 3720 

CACAAATGGC TGAAfSTTTTG CCTTCTCOGC GTGGTCAAAG AGTCATTCCA OGAATAACCA 3780 

TAGAAATGAA AGCAGAGGCA GAAAAGAAAA ATAAAAAGAA AATTAAGAAT GAAAATACTQ 3840 

AA6GAAGCCC TCAAGAAGAT GGTGTQGAAC TAGAAGGCCT AAAACAAAGA TTAOAAAAQA 3900 

AACAGAAAAO AOAACCAGGT ACAAAGACAA AGAAACAAAC TACATTG6CA TTTAAGCCAA 3960 

TCAAAAAAG6 AAAGAAGAGA AATCCCI66C CIGATTCAGA ATCAGATAG6 A6CAGTGA06 4020 

AAAGTAATTT TGATSTCCXTT CCAC6A6AAA CAGAGCCAOG GAGAGCAGCA ACAAAAACAA 4080 

AATTCACAAT GGATTTGGAT TCAGATGAAG ATTTCTCAGA TTTTGATGAA AAAACTGATG 4140 

ATGAAGATTT TGTCCCATCA GATGCTAGTC CACCTAAGAC CAAAACTTCC CCAAAACTTA 4200 

GTAACAAAGA ACTGAAACCA CAQAAAAGT6 TOGTGTCAGA CCTTGAAGCT GATGATGTTA 4260 

AGGGCAGTGT ACCACTGTCT TCAAGCXXrTC CTGCTACACA TTTGCCAGAT GAAACTGAAA 4320 

TTACAAACCC AGTTCCTAAA AAGAATGTGA CASTGAAGAA GAGAGCAGCA AAAAGTCAGT 4380 

CrrCCACCTC CACTACCGGT GCCAAAAAAA GGGCTGCCCC AAAAOGAACT AAAAGGGATC 4440 

CAGCTTTGAA TTCTGGTGTC TCTGAAAAGC CTGATCCTGC CAAAACCAAG AATOGCOGCA 4500 

AAAOGAAGCX: ATCCACTTCT GATGATTCTG ACTCTAATTT TGAGAAAATT GTTTOQAAAO 4560 

CAGTGACAAG CAAGAAATCC AAGGG06AGA OTGATGACTT CCATATGGAC TTTGACTCAG 4620 

CTGTGGCTCC TCGGGCAAAA TCTGTACQGG CAAAGAAACC TATAAAGTAC CTGGAAGAGT 4680 

CAGATGAAGA T6ATCTGTTT TAAAATGTGA GGCGATTATT TTAAGTAATT ATCTTACCAA 4740 

6CCCAAGACT GGTTTTAAAG TTACCTGAAG CTCTTAACTT CCTCCCCTCT GAATTTAGTT 4800

TGGGGAAGGT GTTTTTAGTA CAAG A CATCA AAGTGAAGTA AAGCCCAA6T QTTCTTTAGC 4860

TTTTTATAAT ACT6TCTAAA TAiOTGACCAT CTCATGGGCA TTGTTTTCTT CTCTGCTTTO 4920 

TCTGTGTTTT GA6TCTGCTT TCTTTTGTCT TTAAAACCTG ATTTTTAAGT TCTTCTGAAC 4980 

TGTAGAAATA GCTATCTGAT CACTTCAGOG TAAAGCAGTG TGTTTATTAA CCATCCACTA 5040 

AGCTAAAACT AGAGCAGTTT GATTTAAAAG TGTCACTCTT CeTCCTTTTC TACTTTCAGT 5100 

AGATATGAGA TAGA6CATAA TTATCTGTTT TATCTTAGTT TTATACATAA TTTACCATCA 5160 

GATAGAACTT TATGGTTCTA GTACAGATAC TCTACTACAC TCASCCTCTT ATGTGOCAAG 5220 

TTTTTCTTTA AGCAATGAGA AATTGCTCAT GTTCTTCATC TTCTCAAATC ATCAGAGGCC 5280 

AAAGAAAAAC ACTTTGGCTG TGTCTATAAC TTGACACAGT CAATAGAATG AAGAAAATTA 5340 

GAGTAGTTAT GTGATTATTT CAGCTCTTGA CCT6TCCCCT CTGGCTGCCT CTGAGTCTGA 5400 

ATCTCCCAAA GAGAGAAACC AATTTCTAAG AGGACTGGAT TGCAGAAGAC TOGGGGACAA 5460 

CATTT6ATCC AAGATCTTAA ATGTTATATT GATAACCATX3 CTCAGCAATG AGCTATTAGA 5520 

TTCATTTTGG GAAATCTCCA TAATTTCAAT TTGTAAACTT TGTTAAGACC T6TCTACATT 5580 

GTTATATGTG TGTGACTTGA GTAATGTTAT CAAOGTTTTT 6TAAATATTT ACTATGTTTT 5640 
TCIATTAGCr AAATTCCAAC AATTTT6TAC TTTAATAAAA T6TTCTAAAC ATTGC 

Seq ID NO: 178 Protein sequence: 
Protein Accession #: NP_001058.1

1          11         21         31         41         51

|          |          |          |          |          |

B4SVSFLQPVN EKMQVNKIKK NEDAKKRLSV ERIYQKKTQL EHILLRFDTY IGSVSLVTQQ 60 

MNVyDEDVGI NYREVTFVPG LYXIFDEZLV NAADEtiKQRDP KMSCIRVTID PEMNIiISINN 120 

NGKGZPWEH KVEKKYVPAIi IFGQLLTSSN YDDDBKKVTO GRNGYGAKZiC NZFSTKFTVB 180 

TASREYKKMF KiQTWMDNMGR AOEMELKPPN GEDYTCITPQ PDLSKPKMQS LDKDIVALMV 240 

RRAYDIAGST KDVKVFWIGN KLPVKGFRSY VDMYLKDKLD ETGEISLKVIH BQVNHRWEVC 300 

LTMSERGFQQ ISFVNSIATS RGGRHVDYVA DQIVTKLVOV VKRKNKGGVA VKAHQVKXraN 360 

WZFVKALZEN FTFDSQTKEN MTLQPKSFGS TCQLSEKFZK AAIGOGZVES ZliNWVXPKAQ 420 

VQUIKRCSAV KBNRIKGZPK LDDANDAGGR HSTBCTIiZLT EGDSAXTLAV SGLGWGRDK 480 

YGVFPLRGKI LKVREASHKQ ZMBNAEIMNI IKIVGLQYKK NYEDEDSLKT LRYGKIMIMT 540 

DQDQDGSEIK GLLXNFZHHN WPSXiLR^RFL EEFITPIVKV SKNKQEMAFY SLPEFEBmCS 600 

STPNHKKWKV KYYKGLGTST SKEAKEYFAD MERRRIQFKY S6PEDDAAIS LAPSKKQIDD 660 

RKEHLTNFKB DRSQRKLLGX. PEDYLYOQTT TYLTYMDFIN KEliILFSNSD NER5IPSKVD 720 

6LKPGQRKVL FTCFKRNDKR EVKVAQIA6S VAEMSSYHHG EMSLMMTIZM LAQlilFVGSNN 780

unJiQPIGQF GTRLHGGKDS ASPRYZFTML SSLARLLFPP KDDBTLKFLY DI3NQRVBPEW 840 

YIPIIPKVLI NGABGIGTGW SCKZPKFDVR EIVNNIRRLM DGEEPLPMLP SYKKFKGTIB 900 

ELAPHQYVZS GEVAZLNSTT lEZSELPVRT WTQTYKBQVL EPMUOGTBICT PPLITDYREY 960 

EnyTTVKFW KKTEEKXAEA BRVGtaiCVFK LQTSLTCNSH VLFDHVCCUC IKDTVLDZLR 1020 

DFFELRLKYY GLRKEWLL6M LGABSAKLNM QARFILEKID GKZZZENKPK KELZXVLZQR 1080 

GYDSDPVKAW KEAQQKVPDE EQIEESDNER ETEK5DSVTD SGPTFNYLU> MPLWYLTKSK 1140 

KDELCRLRNE KEQHLDTLKR KSPSDLWKED LATFIEELBA VEAKEKQDBQ V6LPGK6GKA 1200 

KGKKTQMABV LPSPRGQRVI PRZTZEMKAE AEEOQnCICKZK NEKTB6SPQB DGVELEGLXQ 1260 

RLBRXQKRBP GncrKKQTTL AFKPZKKGKK RMPWPDSESD R5SDB8NFDV ^RETEPRRA 1320 

ATKTKFTKDL DSDEDFSDFS ERIIffiSDFVP SDA8PPKTKT SPKLSNKEUC PQXSWSDLB 1380

ADDVKCSVPL 8SSPPATRFP OETBtTNPVP KKNVTVKKTA AKSQSSTSTT QAKKRAAPRO 1440 

TKRDPALNSG VSQKPDPAKT KZIRRKRKPST SDDSDSSXFBK. IVSKAVTSKK SKGESDOFBM 1500 
DFDSAVAPRA KSVRAKKPIK YLEBSDEDDIi F 



Seq ID NO: 179 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-7095

1          11         21         31         41         51

|          |          |          |          |          |

CACACATACG CAC6CACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 
CAAAAAAAAC ATTTCCTTOG CTCCCCCTCC CTCTCCACrC TGAGAA6CA6 AGGAGCC6CA 120 
C3G6C6AGGGG COQCAGAOGQ TCTQGAAATO OQAATOCTAA A6CSTTTCCT C3GCTTGCATT 180 
aUXTCCTCT GTGTTTGOOO CCTOGATTGG 6CTAAT6GAT ACTACA6ACA ACASAGAAAA 240 
CTTCTTGAAG AGATTGGCTG GTOCTATACA GGAGCACTGA ATCAAAAAAA TTGGQQAAAG 300 
AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 
CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAOGGTT GGGATAAAAC ATCATT6GAA 420
AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACGGT 480
6TCA60G6AG 6AGTTTGAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540
AAATGCAATA TOTCATCTGA TGGATCAGAG GATAGTTTAG AAG6ACAAAA ATTTCCACTT 600
GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAQGA AGCA6TCAAA 660
GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGA66 TTGGGACA6A AGAAAATTTG 720
GATTTCAAAG CGATTATTGA TGGASTGGAA AGXGTTAGTC GTTTTGGGAA GCAGGCTGCT 780
TTAGATCCAT TCATACTGTT GAACCTTCTO GCAAACTCAA CTGAOUUSTA TTACATTTAC 840
AATGGCTCAT TGACATCTCC TCCXTTGCACA GACACAGTTG ACTGGATT6T TTTTAAAGAT 900
ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCrrAC AATGCAACAA 960
TCTGGTTAT6 TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020
TTCTCTAGAC AOGTGTTTTC CTCATACACr GGAAAGGAAS AGATTCATGft AGCAOTTTGr 1080
AGTTCA6AAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATA0CA6CCT 7CTTGTTACA 1140
TGGGAAAtSAC CTOSAGTOGT TTATGATACC ATGATTGA6A AGTTTGCAGT TTTGTACCAG 1200
CA8TTGGATG GAGAOSACCA AACCAAGCAT GAATTTTT6A CAGATGGCTA TCAAGACTTG 1260
GGTGCTArrC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320
TGCACTAATG CjCTTATATGG AAAATACAGC GACCAACTSA TTGTOQACAT QOCTACTGAT 1380
AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTO AAGAAATAAT CAAGGAGGAG 1440
GAAGAGGGAA AAGACATTGA AGAAGGCGCr ATTGTGAATC CTG6TAGA6A CAGTGCTACA 1500
AACCAAATCA GGAAAAASGA ACCaaOATT TCTACCACAA CACACIACM TOGCATAGGG 1560
AOIAAATACA ATGAAGCCAA GACTAACCGA TCOXAACAA GAGGAAGT6A ATTCTCTGGA 1620
AAQGGTGATG TTCXXIAATAC ATCTTTAAAT TCCACTTCOC AACCAGTCAC TAAATTAGCC 1680
ACA6AAAAA6 ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740
GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800
AACTTGTOGG GG ACTGCA GA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860
AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTOCAGTCCC 1920
GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAG66TATAT ATTTTCCTCC 1980
GAAAACCCAG A6ACAATAAC ATATX3ATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040
GAAGATTCftA CTTCATCA66 TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100
OTGTGGTTTC CTAQCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160
AGCTTTCTCC AGACTAATTA CACTQAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220
TCCTTTTCTQ CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280
CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TAOCCCATCC 2340
TGCAfihCAAC AOGATTTGGT CT0CACG6TC AAC35TG6TAT ACT06CAGAC AACCCAACCG 2400
6TATACAATG GT6AGACACC TCrrCAAOCT TCCTACAGTA GTGAAGTCTT TCCTCTAGTC 2460
ACCCCTTTGT TGCTTGACAA TCAGATCCrC AACACTACCC CTGCTGCTTC AAGTAGTGAT 2520
TC6GCCTTGC ATGCTAG6CC TGTATTTCCC AGTGTCGATG TGTCATTTGA ATOCATCCTG 2580
TCTTCCTATG ATGGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTGAATTG 2640
TTTCOCCATC TQCATACAGT TTCTCAAATC CTTCCACAAO TTACXTCAfiC TACOGAQAGT 2700
6ATAA6GTGC CCTT6CAT6C TTCTCTGCCA 6TGGCTOQG0 GTGATTTGCT ATTAGAGCCC 2760
AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATQ CTGCTTCAGA GAOGCTGGAA 2820
TTTGGTAGTG AATCTGGTGT TCTTTATAAA ACGCTTATGT TTTCTCAAGT TGAACCACCC 2880
AGCAGTGATG CCAT6ATGCA TGCAC3GTTCT TCAGGGCCTO AACCTTCTTA TGCCTTGTCT 2940
QATA ATOAQ Q GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTOCAAT ACCTGTGCA7 3000
6ATTCT6TG0 GTGTAACTTA TCAGOGTTCC TTATTTA6C0 OCCCTAOCCA TATACCAATA 3060
CCTAAGTCTT C3GTTAATAAC CCCAACPGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120
GGTGATGGGG AATGGTCTGG AGCXrrCTTCT GATAGTGAAT TTCTTTTACX: TGACACAGAT 3180
GGGCTGACAG CCCTTAACAT TTCTTCACXrr 6TTTCTGTA0 CT6AATTTAC ATATACAACA 3240
TCTGTOTTTO GTGATGATAA TAAGG06CTT TCTAAAAGIG AAATAATATA TGQAAATQAG 3300
ACTGAACTGC AAATTCCTTC TTTCAAT6A6 ATG6TTTA0C CTTCTQAAAO GACAOTCATO 3360
CCCAAC3VTGT ATGATAATGT AAATAAGTTG AATGCGTCTT TACAA6AAAC CTCTGTTTCC 3420
ATTTCTAGCA CCAAGGGCAT GTTTCCAGGG TCCCTTOCTC ATACCACCAC TAAGGTTTTT 3480
GATCAT6AGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTGTC 3540
TCTCAAGCAT CTGGTSACAC TTOQCTTAAA CCT6TGCTTA GTGCAAACTC AGAGCCAGCA 3600
TCCTCIGACC CTGCTTCTAO TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATQAO 3660
ACCTCAC5CTT CTTTTAGTAC TGAAGTATTG CTACAACCTT CCTTTCAQGC TTCPGATGTT 3720
GACACCTTGC TTAAAACTGT TCTTCCAGCT GT6CCCAGTQ ATCCAATATT 0GTT6AAA0C 3780
CXXIAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840
AGTGAAAACA TGCT6CACTC TACKTCTGTA CCAOrTTTTQ AT6T6TG60C TACTTCTCAT 3900
ATOCACTCTQ CTTCACTTCA AOGTTTOACC ATTTCCTATO CAAGT6AGAA ATATQAACXa 3960
GTTTTGTTAA AAA6TGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGAO 4020
TTGTTCCAAA CGGCCAATTT GGAGATTAAC CAGGCCCATC CCCCAAAAGG AAGGCATGTA 4080
TTTGCTACAC CTGnTTATC AATTGATGAA CCATTAAATA CACXAATAAA TAAGCTTATA 4140
CATTCOGATG AAATTTTAAC CTCCACCAAA A6TTCT8TTA CTGGTAAOGT ATTTGCTGGT 4200
ATTCCAACAG TTGCTTCT6A TACATITGTA TCTACTOATC ATTCTGTTCC TATAGGAAAT 4260
GQGCATGTTG CCATTACAGC TGTTTCTCCC CACAGAGATG GTTCTGTAAC CTCAACAAAO 4320
TTGCTGTTTC CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCOGGT 4380
TTAGTGGGTG GTGGTGAAGA TGGTGACACT GATGATOATQ GT6ATGATGA TGATGATGAC 4440
AGAG6TAGTQ AT6GCTTATC CATTCATAAO TOTASGTGAT GCTCATOCTA TAOAGAATOi 4500
CAGGAAAAGO TAATGAAT6A TTCAGACACC CAOGAAAACA 6TCTTATG6A TCAGAATAAT 4560
CCAATCTCAT ACTC3VCTATC TGAGAATTCT GAAGAAGATA ATAGAGTCAC AAGTQTATCC 4620
TCAGACAGTC AAACTGGTAT GGACAGAAGT CXn^GTAAAT CACCATCAGC AAATGGGCTA 4680
TCCCAAAAGC ACAATGATG6 AAAAGAG6AA AATGACATTC A6ACTGGTA0 TGCTCTGCTT 4740
CCTCTGAGCC CXOAA3CTAA AOCATGGGGA GTTCTQACAA GT6AIQAAGA AA GTCGA TCA 4800
GGGCAAG6TA CCTCAGATAG CCTTAATGAG AAT6AGACTT CCACAGATTT CAGTTTTGCA 4860
GACACTAATG AAAAAGATCC TGATGGQATC CTGGCAGCAG GTGACTCAGA AATAACTCCT 4920
GQATTCCCAC AGTCCCCAAC ATCATCTGTT ACTAGOGAGA ACTCAGAAGT GTTCCAOGTT 4980
TCA6AGGCAG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC T6AGGGGTTG 5040
GAATOCGAGA A6AAGGCAGT TATAOOCCIT GTGATOGTGT CAOCCCTGAC TTTTATCTGT 5100
CTAGTGGTTC TTGTGG G T A T TCTCATCTAC TGGA6GAAAT GCTTCCAGAC TGCACACTTT 5160
TACTTABAQG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCRACAOC TATCTTTOCA 5220
ATTTCAGATO ATGTOGQAGC AATTCCAAIA AAGCACTTTC CAAAGCATGT TQCAGATTTA 5280
CATGCAAGTA GTGQGTTTAC TX3AAGAATTT GAGACACTGA AA6A6TTTTA CCAGGAAGTG 5340
CAGAGCTGTA CTGTTGACTT AGGTATTACA GCAOACAGCT CCAACCA0C3C AGACAACAAG 5400
CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA 6CA0GGTTAA GCTAGCACA6 5460
CiTGCTGAAA AQQATQOCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 5520
AACAGACCAA AAGCTTATAT TCCTGCCCAA GGCOCACTGA AATCCACAGC TGAAGATTTC 5580
TG6AGAAT6A TAIGGGAACA TAATGT6GAA GTTATTGTCA TGATAACAAA CCT06T6GAG 5640




AAAGGAAGGA GAAAATGTGA TCAGTACTGG CCTGCCaGATG GGAGTGAGGA GTACGGGAAC 5700 

TTTCTGGTCA CTCAGfiAGAG TGTGCAAGTG CTTGCCTATT ATACTSTGAG GAATTTTACT 5760 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAOGAA GACKXAGTCO AOQTGTGGTC 5820 

ACACAGTATC ACTACACX3CA GTGGOCTGAC ATGGGAiGTAC CAGA6TACIC CCTGCCAGTG 5880 

CTGACCTTTG TGAGAAAGGC AGCCTATOCC AAGCGCCATG CftGTGGGGCC TGTTGT06TC 5940 

CACTGCAGTG CTGGAC3TTGG AAGAACAGGC ACATATATTG TGCTAGACRG TATGTTGCAG 6000 

CACSATTCAAC ACGAAOGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 6060 

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAQ 6120 

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACABTCATA TTCATGCCTA TGTTAATGCA 6180 

CTOCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GCTCCTGAGC 6240 

CAGTCAAAIA TACAOCAGAG TGACTATTCT GCAGCCCTAA AGCAATGCAA CAGOGAAAAG 6300 

AATOQAACTT CTTCTATCAT CCCTGTSGAA AGATCAAGGG TTGGCATTTC ATCCCTGAGT 6360 

GGAGAAGGCA CAGACTACAT CAATGCXTTCC TATATCATGG GCTATTACXai GAGCAATGAA 6420 

TTCATCATTA CCCAGCACCC TCTCCTTCAT ACCATCAAGG ATTTCTGGAQ GATGATATGQ 6480 

GACCATAATG CCCAACTGGT GGTTATGATT CXTOATGGCC AAAACATGGC AGAAGATGAA 6540 

TTTGTTTACT GOCCAAATAA AOATGAGCCT ATAAATTGTG AGAGCTTTAA GGTCACTCTT 6600 

ATGGCTGAAG AACACAAAT6 TCTATCTAAT GAG6AAAAAC TTATAATTCA GGACTTTATC 6660 

TTAGAAGCTA a«»GGATGA TTATGTACTT GAAGTGAGGC ACTTTCA6TG TCCTAAATGQ 6720 

CCAAATCCAG ATAGCCCCAT TAGTAAAACT TTTGAACTTA TAAGT6TTAT AAAAGAA6AA 6780 

GCTGCCAATA GG6AT66GCC TATGATTGTT CAT6ATGA6C ATGGAGGAGT GACGGCAGGA 6840 

ACTTTCT6TG CTCTGACAAC CCTTATGCAC CAACTAGAAA AAGAAAATTC GGTGGATGTT 6900 

TACCaGGTAG CCAAGATGAT CAATCT6ATG A6GCX2AGGAG TCTTT6CTGA CATTSAGCR6 6960 

TATCAGTTTC TCTACAAAGT GATCCTCAGC CTTGTGAGCA CAAGGCAGGA AGAGAATCCA 7020 

TCCACCTCTC TGGACAGTAA TGGTGCAGCA TTGCCTGATG GAAATATAGC TGAGAGCTTA 7080 

GAGTCTTTAG TTTAACACAG AAAGGGGTGO GGGGACTCAC ATCTGAGCAT TGTTTTCCTC 7140 

TTOCTAAAAT TAGGCAGQAA AATCAGTCTA OITCTGTTAT CTGTTGATTT CCCATCACCT 7200 

GACAGTAACT TTCATGACAT AGGATTCTGC OGCCAAATTT ATATCATTAA C AATGTC TGC 7260 

CTTTTTGCAA GACTTGTAAT TTACTTATTA TGTTTGAACT AAAATGATTG AATTTTACAG 7320 

TATTTCTAAG AATGGAATTG TGGTATTTTT TTCTGTATTG ATTTTAACAG AAAATTTCAA 7380 

TTTATAGAGG TTAGGAATTC CAAACTACAG AAAATGTTTG TTTTTAGTGT CAAATTTTTA 7440 

GCTGTATTT6 TAGCAATTAT CAGGTTTGCT AGAAATATAA CTTTTAATAC AGTAGCCTCT 7500 

AAATAAAACA CTCTTCCATA TGATATTCAA CATTTTACAA CTGCAGTATT CACCTAAAGT 7560 

AGAAATAATC TGTTACTTAT TGTAAATACT GCCCTAGTGT CTCCATGGAC CAAATTTATA 7620 

nTATAATTG TAGATTTTTA TATTTTACTA CTGAGTCAAG TTTTCTAGTT CTGTGTAATT 7680 

G7TTAGTTTA ATGAC6TAGT TCATTAGCTG GTCTTACTCT ACCAGTTTTC TGACATTGTA 7740 

TTGTGTTACC TAAGTCATTA ACTTTGTTTC AGCAT6TAAT TTTAACTTTT GT6SAAAATA 7800

GAAATACCTT CATTTTGAAA 6AAGTTTTTA T6ASAATAAC ACCTTACCAA ACATTGTTCA 7860 

AATGGTTTTT ATCCAAGGAA 7TGCAAAAAT AAATATAAAT ATTGCCATTA AAAAAAAAAA 7920 
AAAAAAAAAA AAAAAAAAAA AAAA 

Seq ID NO: 180 Protein sequence:
Protein Accession #: Eos sequence

1          11         21         31         41         51

|          |          |          |          |          |

KRILKHFIAC ZQLIiO^CRLD MANGyYRQQR KLVBBIGWSY TGALNQXNHG XtOrPTCKSPR 60 

QSPZNIDBDL TQVNVKLKKL KFQGHDKTSL ENTFZHNTGK TVEINZiTNDY RVSG6VSEMV 120 

FKASKITFHW GKCKMSSDGS EHSLEGQKFP LEKQXYCFDA DRFSSFEEAV K6KGKLRALS 180 

ILFEVGTEEN U>FKAIIDGV BSVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVPK DTVSISESQL AVFCEVLTMQ QS6yVMIJ«©y LQNNFREQQY KFSRQVPSSY 300 

TGKEBIHEAV CSSEPEHVQA DPEHYTSLLV THERPRVWD TKXEKFAVLY QQLDQSDQTK 360 

HBFLTDGYQD LGAILNNZiLP NM5YVLQZVA ZCTNGLYGKy SDQLZVDMPT ONPELDLFPB 420 

LIGTBEIIKB EEB6KD1BEG AZVNPGRDSA TNQZRKKEPQ ISTTTHYNRZ GTKYNSAKTN 480 

RSPTRGSBPS GKGDVPNTSL NSTSQPVTKL ATEKDXSLTS QTVTELPPHT VEGTSASU© 540 

GSKTVLR5PH KHLSGTAESL NTVSZTBYEB ESLLTSFKLD TGAEDSSGSS PATSAZPFXS 600 

ENISQQyiPS SEKPETZTYD VLZPESASNA SEDSTSSG5B ESLKDPSHEG NVWPPSS1DI 660 

TAQPDVGSGR ESPLQTHYTE ZRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNWYSQTTQ PVYNGETPIjQ PSYSSEVFPL VTPIiLLDNQI 780 

UJTTPAASSS DSALHATPVF PSVDVSPESZ LSSYDGAPLL PFSSASFSSE hFSBLBTJBQ 840 

ZLPQVTSATB SDKVPIflASL PVAGG0LLLE PSIAQYSDVL STTHAASETL EPGSESGVLY 900 

KTLMFSQVEP PSSDAMMHAR SSGPBPSYAL SDNEGSQHIF TVSYSSAZPV HDSVGVTYQG 960 

SLPSGPSHIP IPKSSLXTPT ASI^^PTHAL SGDGEUSGAS SDSEFLLPXTT DGLTAIiilSS 1020 

PVSVABFTYT TSVFGDONKA LSKSBZXY6K ETELQIPSFN EMVYPSESTV MFNKYBNVNK 1080 

LNASLQETSV SZSSTKGMFP GSIAHTTTKV EDHEZSQVPE NNPSVQPTHT VSQASGDTSL 1140 

KPVLSAKSEP ASSDPASSEM I*SPSTQLIiPY ETSASFSTEV liQPSFQASD VDTLLKTVLP 1200 

AVPSDPILVB TPKVDKISST MLHLIVSNSA SSEKMI£STS VPVFDVSPTS HMHSASLQGL 1260 

TZSYASEKYE PVLLKSBSSH QWPSLYSND ELFQTANLSZ KQAHPPKGRH VFATFVLSIP 1320 

^imUUm, IHSDBILTST KSSVTGKVFA GIPTVASDTF V5TDRSVPX6 HGKVAITAVS 1380 

PERDGSVTST KLLFPSECATS EL5HSAKSDA 6LVGG6EDGD TDDOGDDDDD ORGSDGLSIB 1440 

KCMSCSSYRE SQEKVMNDSD TRENSLMDQN NPZSYSLSEN SEEDNRVTSV SSDSQT6MDR 1500 

SPGKSPSAHG LSQKHHDaKE EKDIQTG5AL LPLSPBSKAW AVLTSDEESG SGOGTSDSLET 1560 

ESiSTSTDFSF ADTNEKDAXX3 XLAAGDSSXT PGFPQSPTSS VTSENSEVFH VSEAEASNSS 1620 

HBSRZSLABG LBSEKKAVIP LVIVSALTPI CLVVI»V6ZIiI YHRKCFOTAR FYLSDSTSPR 1680 

VXSTPPTPIF PISDDVGAIP ZXBFPKBVAD IiHASSGFTEB FETIjKBFYQB VOSCTVDLGI 1740 

TADSSMHPDN KEKNRYZNIV AYZSBSRVKLA QLAEKD6KLT DYZKANYVDG YNRPRAYZAA 1800 

QGPUCSTAED FWRMtWEHNV EVXVMZTNLV EKGSRKC33Qy MPADGSEEYG OTLVTQKSVQ 1860 

VLAYYTVRNF TLRNTXIKXG SQKGRPSGRV VTQYEYTQIMP DMGVPEYSLP VLTFVRKAAY 1920 

AKRHAVQPW VBCSASVQRT 6TYZVLDSML QQZORBSTVH ZFGFLKBIRS QRNYliVQTEB 1980 

QYVFIICmAr EAZLSKETEV U3SEZRAYVII ALIiXPGPASK TKLEKQFQUi SQSHIQQSDY 2040 

SAALKQCNRE KKRTSSZXPV ERSRVGISSL SGEGTDYXMA SYZMGYYQSK BFZZTQHPLL 2100 

HTZKDFWRMX WDHNAQLWM XPDGQNMAED KFVYWPNKDE PZNCESFKVT LMAEEHKCLS 2160 

NEEKLIIQDF ZLEATQDDYV bEVRHFQCPK WPNPDSPISK TFBLXSVZKB EAANRDGPMI 2220 

VEDEKOGVTA GTFCALTTIM BQZiBKaiSVD VYOVAKHZHL MRFGVFADZB QyQFLYICVZI< 2280 
SX.VSTRQSBN PSTSLDSNGA ALPDGNIABS Z.SSLV 

Seq ID NO: 181 DNA sequence

Nucleic Acid Accession #: Eos sequence

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60

CAAAAAAAAC ATTTCCTTOG CTCCCCCTCC CTCTCC31CTC TGAGAAGCAG AGGAGCCXSCA 120 

CX3GOC3ACGGG COGCAGACOS TCTGGAAATG CX3AAT0CTAA AGOrrTTCCT OGCTTGCATT 180 

CAGCTCCTCT GTGTTT6C0Q CCTGGATrGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTT6TTQAAG AGATTGGCrO GTCCTATAGA OGAGCACTGA ATCAAAAAAA TTOGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTTATCA ATA7TGAT6A ASATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACGGT 480

GTCAGOGGAG GAGTTICAGA AATGGTGTTT AAAGCAAQCA AGAIAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGA6 CATAGTTTAG AA6GACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTOCTT TGAT6CGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG OQATTATTGA TGGAGTOGAA AGTGTTAGTC GTTTTGGGAA 6CA0GCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCT6 CCAAACTCAA CTGACAAGTA TTACATTTAC 840

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAfiTTAGCA TCTCTGAAAG 0CAGTTG6CT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATO TCATGCTGAT GGACTACTTA GAAAACAATT TTOQAGAGCA ACAGTACAAG 1020 

TTCTCTA6AC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATX3A AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGA6AATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTOGAGTCGT TTATGATACC ATGATTQASA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAOGACCA AACCAAGGAT GAATTTTTQA CAGA7GGCTA TGAASACTTG 1260 

GQT6CTATTC TCAATAATTT 6CTACCCAAT ATGAGTTAT6 TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAAT6 GCTTATATGG AAAATACAGC GACCAACTGA TT6TCGACAT GCX:TACTGAT 1380 

AATCCT6AAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGC6CT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCA6ATT TCTACCACAA CACACTACAA TC6CATAGGG 1560 

AOGAAAIACA AT6AAGCCAA GACTAACOSA TCCXXAACAA GAGGAAGTXSA ATTCTCTGGA 1620 

AAOGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCACTCAC TAAATTA6CC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACT6CCACC TCRCACT6TG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800

AACTTGTOGG GGACTGCAGA ATCCTTAAAT ACA6TTTCTA TAACAGAATA TGAGGAGGAO 1860 

A6TTTATTGA CCAGTTTCAA OCTTGATACT GtSAGCTGAAO ATTCTTCAOS CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGA6 AACATATCC3C AAGG6TATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATQA3CTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTG6TTTC CTAGCTCTAC A6ACATAACA GCACAGCCOS ATGTT6GATC AG6CAGAGAG 2160 

AGCTTTCTOC AQACTAATTA CACTGASATA OaTGTTGATO AATCTGAOAA GACAACCAA6 2220 

TCCTTTTCTO CAGGCCCAGT GATGTCACAG GGTCOCTCAG TTACAGATCT GQAAATOCCA 2280 

CATTATTCTA CCTTTGCCTA CTTOCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG CAGAGGCCAG TAATAGTAGC CATGAGTCTC GTATTGGTCT AGCTGAGGGG 2460 

TTG6AATC0G AGAAOAAdSGC AGTTATACGC CTTOTGATGS TGTCAGCCCT GACTTTTATC 2520 

T6TC7A6TGO TTCTTGTGGG TATTCTGATC TACTGGAOGA AATGCTTCCA 6ACT6CACAC 2580 

TTTTACTTAG AGGACAGTAC ATCCCCTAGA GTTATATCCA CACCTCCAAC ACCTATCTTT 2640 

CCAATTTCAG ATGATGTCGG AOCAATTCCA ATAAAGCACT TTCCaAAGCA TGTTGCAGAT 2700 

TTACATGCAA GTA6TGG6TT TACT6AAGAA TTTGA6ACAC T6AAAGAGTT TTACCAGGAA 2760 

GTGCAGAGCT GTACT6TTGA CTXAG6TATT ACAGCAOACA OCTGCAACCA OOCAGACAAC 2820 

AAGCACAAGA ATOGATACAT AAATATOGTT GCXrTATGATC ATAGCAGGGT TAAGCTAGCA 2880 

CAGCTTGCTG AAAAGGATGO CAAACTGACT GATTATATCA ATGCCAATTA TGTTGATGGC 2940 

TACAACAGAC CAAAAGCTTA TATTGCTGCC CAAGGCCCAC TGAAATCCAC AGCTGAAGAT 3000 

TTCT6GAGAA TGATATGGGA ACATAATGTG GAAGTTATlt3 TCATGATAAC AAACCTCGTG 3060 

GAGAAAGGAA GGAGAAAAT6 TGATCAGTAC TGGCCTGCXX; A1X3GGAGTGA GGAGTACGGG 3120 

AACTTTCT6G TCACTCAGAA GAGTGTGCAA 6TGCTTGCCT ATTATACTGT GAGGAATTTT 3180 

ACTCTAAGAA ACACAAAAAT AAAAAAGGGC TCCCA6AAAQ GAAGACCCAG TGGAOGTGTG 3240 

GTCACACAGT ATCACTACAC GCAGTGGCCT GACATGGGAG TACCAGAGTA CTCCCTGCCA 3300 

GT6CTGACCT TTOTGAGAAA GGCAGCCTAT GCCAAG06CC ATGCAGTGGG GCCTGTTGTC 3360 

GTCCACTGCA 6TGCTGGAGT TGGAAGAACA 6GCACATATA TTGT6CTA6A CAGTATGTT6 3420 

CAGCAGATTC AACACXSAAGG AACTGTCAAC ATATTTGGCT TCTTAAAACA CATCCGTTCA 3480 

CAAAGAAATT ATTTGGTACA AACTGAGGAO CAATATGTCT TCATTCAT6A TACaCTGGTT 3540 

GAGGCCATAC TTAGTAAAGA AACTGAGGTQ CTGGACAGTC ATATTCAOXSC CTATGTTAAT 3600 

6CACTCCTCA TTCCTGGACC AGCAG6CAAA ACAAA6CTA6 AGAAACAATT 0CAGCTCCT6 3660 

AOCCAGTCAA ATATACAGCA GAGTGACTAT TCT6CAGO0C TAAAGCAATG GAAGAGGGAA 3720 

AAGAATCGAA CTTCTTCTAT CATCCCTGTG GAAAGATCAA GGGTTOGCAT TTCATCCCTG 3780

AGTGGAGAAG GCACAGACTA CATCAATGCC TCCTATATCA TGG6CTATTA CC31SAGCAAT 3840 

GAATTCATCA TTACCCAGCA CCCTCTCCTT CATACCATCA AG6ATTTCT0 GAGGATGATA 3900 

TGGGACCATA ATGCCCAACT GGTGGTTATG ATTCCTGATO GCCAAAACAT GGCAGAAGAT 3960

GAATTTGTTT ACTGGCCAAA TAAAGATGAG CCTATAAATT GTGA6AGCTT TAAGGTCACT 4020 

CTTATGGCTG AAGAACACAA ATGTCTATCT AATGAGGAAA AACTTATAAT TCAGGACTT7 4080 

ATCTTAGAAG CTACACAGGA TGATTATGTA CTT6AA6T6A GGCACTTTCA GIGTCCTAAA 4140 

TGGCCAAATC CAGATAGCCC CATTAGTAAA ACTTTTQAAC TTATAAGTGT TATAAAAGAA 4200 

GAAGCTGCCA ATAGGGATC^ GCCTATGATT GTTCATGATG AGCATGGAGG AGT6A06GCA 4260 

OGAACTTTCT GTGCTCTGAC AACCCTTATQ CACCAACTAG AAAAAGAAAA TTCCGTGGAT 4320 

GTTTACCAGG TAGCCAAGAT GATCAATCTG ATGAGGCCA6 GA6TCTTTGC TGACATXtSAG 4380 

CAGTATCAGT TTCTCTACAA AGTGATOCTC AGCCTTGTGA GCACAA6GCA GGAAQAGAAT 4440 

CCATCCACCT CTCTGGACAO TAATQGTGCA GCATTGCCTG ATGGAAATAT AGCTGAGA6C 4500 

rrAQAGTCTT TA6TTTAACA CAQAAAGGGG TGGGGGGACT CACATCTGAG CATTGTTTTC 4560 

CTCTTCCTAA AATTAGGCAO GAAAATCAGT CTAGTTCTGT TATCTGTTGA TTTCCCATCA 4620 

CCTGACAGTA ACTTTCATGA CATAGGATTC TGC06CCAAA TTTASATCAT TAACAATGTG 4680 

TGCSCTTTTTG CAAGACTT6T AATTTACTTA TTATGTTT6A ACTAAAATGA TTGAATTTTA 4740 

CAGTATTTCT AAGAATCGAA TTGT6GTATT TTTTTCTGTA TT GATTTTA A CAGAAAATTT 4800 

CAATTTATAG AGGTTAGGAA TTCCAAACTA CAGAAAATGT TTGTTTTTAO TGTCAAATTT 4860 

TTAGCTGTAT TTGTAGCAAT TATCAGGTTT GCTAiSAAATA TAACTTTTAA TACAGTAGCC 4920 

TGTAAATAAA ACACTCTTCC ATATGATATT CAACATTTTA CAACT6CAGT ATTCACCTAA 4980

AQTAQAAATA ATCTGTTACT TATTGTAAAT ACTGCCCTAO TGTCTCCATG GAOCAAATTT 5040

ATATTTATAA TT6TAGATTT TTATATTTTA CTACTGASTC AAGTTTTCTA 6TTCTGTGTA 5100 

ATTGTTTAGT TTAATGACGT AGTTCATTAG CTGGTCTTAC TCTACCAGTT TTCTGACATT 5160 

GTATTGTGTT ACCTAAGTCA TTAACTTTGT TTCAGCATGT AATTTTAACT TTTGTGGAAA 5220

ATAGAAATAC CTTCATTrTG AAAGAAGTTT TTATGAGAAT AACACCTTAC CAAACATTGT 5280 

TCAAATGGTT TTTATGCAAQ CSAATTGCAAA AATAAAZAXA AATATTGCCA TTAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 182 Protein sequence:
Protein Accession #: Eos sequence

1          11         21         31         41         51

|          |          |          |          |          |

MRIXiKRPLAC IQIiLCVCRIiD WAHGYYRQQR KZiVEEIGHSY TGALNQKNWG KKYPTQISPR 60 

QSPIMIDEDL TQVNVNLKKL KFQGWDKTSL QITFIBNT6K TVEINZiTNDY RVSGGVSEMV 120 

FKASKITFHH GKCSIMSSDGS EHSLBGQXFP I£KQIYCFDA DRFSSFEEAV K6KGKLRALS 180 

ILFBVGTEEN LOFKAIIDGV BSV5RFGXQA ALDPFILLNL LPMSTDKYYI YK6SLTSFPC 240 

TDTVDWrVPK DTVSISESOl. AVPCEVLTMQ QSGYVMUOY LQNNFREQQY KFSRQVFSSY 300 

T6KEEZBEAV CSSEPEHVQA DPENYTBIiLV THERPRWYD TMIEKFAVLY QQLDGBDQTK 360 

KEFLTDGYQD LGAIUIMLLP NMSYVLQIVA ICTN6LY6KY SDQLZVDMPT DKPBU3I.FFB 420 

LX6TEEIIKE EEB6XDXBE6 AIVNPGRDSA TNQIRKKEPQ ISTTTBYmX GTKmEAKTN 480 

R5PTRGSEFS GKSDVFNTSL NSTSQPVTKL ATSKDISLTS QTVTELPPRT VEGTSASUID 540 

GSKTVLRSPH MNLSGTAE5L NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENZSQGYZFS SENPETITYD VltZPBSARNA SEDSTSSGSE ESLKDPSMEG NVHFPSSTDI 660 

TAQPDVGSGR BSFLQmrTB IRVDESEKTT KSFSA6PVMS QGPSVTDLEM PEYSTFAYFP 720 

TEVTPHAFTP S5RQQDLVST VNWYSQTTQ FVYNAEASNS SHESRIGLAE GLESEKKAVI 780 

PLVrVSALTF ICLWLVGIL lYWRKCFQTA HFYLEDSTSP RVISTPPTPI FPISDDVGAI 840 

PIKHPFKHVA DUiASSGFTE EFETLKEFYQ EVQSCTVDLG ITADSSNHPD KKHXNRYINI 900 

VAYDHSRVKlt AQIAEKDGKL TDYZKANYVD GYNRPKAYIA AQGPLKSTAB DFWRMIHEHN 960 

VEVIVMZ7NL VEKGRRKCDQ YWPADGSEEY GNFLVTQK5V QVLAYYTVRK FTLRNTRIKK 1020 

GSQK6RPSGR WTQYKYTQW PDMGVFEYSL PVLTFVRKAA YAKRKAVGFV WHCSAGVGR 1080 

TGTYIVLDSM LQQIQHBGTV HIFGFLKEIR SQRNYLVQTB EQYVFIHDTL VEAILSKETE 1140 

VLDSHIHAYV NALLIPGPAG KTKLBKQFQL LSQSNIQQSD YSAALKQOJR EKNRTSSIIP 1200 

VERSRVGISS LSGBGTDYIN ASYIM6YYQS NSFIITQHPL LHTIKDFWHM IWDHNAQLW 1260 

MIPDGQNMAE DEFVYWPNKD EPINCESFKV TLMAEEHKCL SNSEKLIIQD FILEATQDDY 1320 

VLEVRHFQCP fCWPNPDSPIS KTFELZSVIK EEAAHRD6PM ZVHDEH6GVT AGTFCALTTL 1380 

MHQLEKENSV DVYQVAKMIH LMRFGVFADZ EQYQFLYKVZ LSLVSTRQEE KPSTSLDSNG 1440 
AAXiPDQfZAB 8LE8LV 

Seq ID NO: 183 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-4494

1          11         21         31         41         51

|          |          |          |          |          |

CACACATAOO CAOGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTC6 CTCCCOCTCC CTCTOCACTC T6AGAAGCA0 AGOAOCCGCA 120 

0GGCGAGGG6 CCSCA6AC0G TCTGGAAATG 06AATCCTAA AO O STTTCCT 06CTTGCATT 180 

CAGCTCCTCT G lxnViXX'CU CCTGGATTGQ GCTAATCGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATT6AT6A A6ATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT OGOATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACOGT 480 

GTCAG06GA6 GAGTTTCAGA AATGGTGTTT AAAGCAA601 ASATAACTTT TCACT0GG6A 540 

AAATGCAATA TGTCATCTQA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT T6ATGCAGAC OGATTTTCAA aTTTTGAGGA AGCAGTCAAA 660 

66AAAA0GGA AGTTAAGAGC TTTATCCATT TT6TTT6AG6 TT006ACAGA AGAAAATTTG 720 

6ATTTCAAAG C6ATTATT6A T6GAGT0GAA AGT6TTA6TG GTTTTGGGAA GCAGGCTGCT 780 

rrAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTQACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGC3VCA GACAC3W3TTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTA6CA TCTCTGAAAG OCAGTTGGCT GTTTTrTGTG AAGTTCTTAC AATX3CAACAA 960 

TCZGQTIATO TCATOCTGAT GGACTACTTA CSkAAACAATT TTOSAGAGCA ACAGIACAA8 1020 

TTCTCTAGAC A6GTGTTTTC CTCATACACT GGAAA66AAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCC7 TCTT6TTACA 1140 

TGGGAAAGAC CTCGA6T0GT TTATGATACC ATQATTGAQA AGTTTGCAGT TTTGTACCAG 1200 

CA6TTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

aGTGCTATTC TCAATAATTT QCTACCCAAT ATGAGTTAT6 TTCTTCAGAT ASTAGCCATA 1320 

T6CACTAAT6 GCTTATAT6G AAAATACAGC GACCAACTGA TT6T08ACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGOGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA G6AAAAA6GA ACCCCAGATT TCTACCACAA CACACTACAA TC6CATAGGG 1560 

A08AAATACA AT6AAGCCAA GACTAAGCGA TCCCCAACAA QAOGAAGTGA ATTCTCTGGA 1620 

AA6QGTQAT6 TTGCCAATAC ATCTTTAAAT TGCACTTCXX AAOCAGTCAC TAAATTAGCC 1680 

ACAGAAAAA6 ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCtCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGT066 GGACTGCAGA ATCCTTAAAT ACAQTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCriGATACT GGAGCTGAAG A TTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTSAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

6AAAACCXAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATQCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAA6AA TCACTAAAQG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC A6ACATAACA GCACAGOCOG AT6TTGGATC AGGCAGAGAO 2160 

A6CTTTCTCC A6ACTAATTA CACTGAGATA 0GTGTT6ATG AATCTGAOAA 6ACAAGCAAG 2220 

T0CTTTTCT6 CAG6CCCA6T 6AT6TCACA6 GGTCCCTCAG TTACA6ATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAAC3«: CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGQATTTGGT CTCCACGGTC AAOGTGGTAT ACTCGCAGAC AACCCAACXX5 2400 

6TATACAATG AGGCCAGTAA TAGTAGCCAT GA6TCT0GTA TTGGTCTASC TGA66GGTTG 2460 
GAATC0QU3A ASMG6CA6T TATACOCCTT GTGATOGKOT CAGCCCI6AC ■ nyiA T C TGT 2520 

CIAGTCeTTC TTGTGGOTAT TCTOVTCIAC TGGAGGAAAT GCTTCCASAC TGCACACTTT 2580 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAGATG ATGT0GGAC3C AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 2700 

GA7GCAAGTA GTGGGTTTAC TGAAGAATTT GAGGAA6T6C ASAGCTGTAC TGTTGACTTA 2760 

OGTATTACAG CAGACAGCIC CAACCAOCCA GACAACAAGC ACAAGAATOG AXAOlTAAAT 2820 

ATOGTTGCCr ATGATCATAG CAG66TTAAG CTA6CACA6C TT6CTGAAAA GGKIGGCAAA 2880 

CTGACTGATT ATATCAATGC CAATTATCTT GATGGCTACA ACAGACCAAA AGCTTATATT 2940 

GCTGCCC3\AG GCXXIACTGAA ATCCACAGCT GAAGATTTCT GGAGAATGAT ATGGGAACAT 3000 

AATGTG6AA0 TTATIGTCAT GATAACAAAC CTC6TGGAGA AAOGAAOGAG AAAATGTGAT 3060 

CA6TACIGGC CT6C0GATGG GAGTGAGGAG TAOGGQACT TTCTG G TCAC TCAGAAGAGT 3120 

GTGCAAGTGC TTGCCTATTA TACTGTGAG6 AATTTTACTC TAAGAAACAC AAAAATAAAA 3180 

AAGGGCTCCC AGAAAGGAAG ACCCAGTGGA OGTGT G GTCA CACAGTATCA CTACA06CAG 3240 

TGGCCTGACA TGGGAGTACC AGAGTACTCC CTGCCAGTSC TGACCTTTGT GAGAAAGGCA 3300 

GCCTATGCCA A6CXSCCATGC AGTGG6GCCX GTTGTOGTCC ACTGCAGTGC TGGAGTTGGA 3360 

AGAACAGGCA CATATATTGT 6CTAGACAOT ATGTTGCAGC AGATTCAACA OGAAGGAACT 3420 

GTCAACATAT ' nXjG L T X'Cri' AAAACACATC CGTTCACAAA GAAATTATTT GGTACAAACT 3480 

GAGGA6CAAT ATXrrCTTCAT TCATGATACA CTGGTTGAG6 CCATACTTA6 TAAAGAAACT 3540 

GAGGTGCTGG ACA6TCATAT TCATGCCTAT GTTAATGCAC TCCTCATTCC TGGACCAGCA 3600 

GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTCCTGAOCC AGTCAAATAT ACAGCAGAGT 3660 

GACTATTCTG CAGCCCTAAA GCAATOCAAC AGGGAAAAGA ATCGAACTTC TTCTATCATC 3720 

CCTCTGGAAA QATCAAGGGT TGGCATTTCA TCCCTGAGTG GAGAAGGCAC AGACTACATC 3780 

AATGCCTCCT ATATCATGGG CTATTACCAG AGCAATGAAT TCATCATTAC CCAGCACCCT 3840 

CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGQ ACCATAATGC CCAACTQGTG 3900 

GTTATGATTC CTGATGGCCA AAACATQGCA GAAGATGAAT rTGTTTACTG GCCAAATAAA 3960 

GATGAGCCTA TAAATTGTGA QAGCTTTAAG GTCACTCTTA TGGCTGAAGA ACACAAATGT 4020 

CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT TAGAAGCTAC ACAGGATGAT 4080 

TATGTACTTO AA0TGAGGCA CTTTCAGTGT CCTAAATGGC CAAATCCA6A TAGCCCCATT 4140 

AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG CTGCCAATAG GGATGGGCCT 4200 

ATGATTGTTC ATGATGAGCA TGGAGGAGTG ACGGCAGGAA CTTTCTGT6C TCTGACAACC 4260 

CTTATGCACC AACTAGAAAA AGAAAATTCC 6TGGATGTTT ACXAGGTAGC CAAGAT6ATC 4320 

AATCTGATGA GGCCAG6AGT CTTTGCT G AC ATTGAGCAGT ATCAGTTTCT CTACAAAGTG 4380 

ATCCTCAGOC TTGTOA6CAC AAGGC3U3GAA GAGAATOCAT CCACCTCTCT GGACAGTAAT 4440 

GGT6CAGCAT TGCCTGATGG AAATATAGCT GAGAGCTTAG AGTCTTTAGT TTAACACAGA 4500 

AAGGGGTGGG 'GGGACTCACA TCTGAGCATT UTmXX'iVr TCCTAAAATT AGGCAGGAAA 4560 

ATCA6TCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG ACAGTAACTT TCATGACATA 4620 

GGATTCTGCC GCCAAATTTA TATCATTAAC AATGTGTGOC TTTTTGCAAG ACTTGTAATT 4680 

TACTTATTAT GTTTGAACIA AAATGATTGA AXTTTACAGT ATTTCTAAGA ATGGAATTGT 4740 

GGTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT TTATA6AGGT TAG6AATTCC 4800 

AAACTACAGA AAAT G ITT G T TTTTAGTGTC AAATTTTTAG CTGTATTTGT AGCAATTATC 4860 

AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGOCTGTA AATAAAACAC TCTTCCATAT 4920 

GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA GAAATAATCT GTTACTTATT 4980 

GTAAATACTO OCCTAOTSTC TCCATGQACC AAATTTATAT TTATAATTGT AGATTTTTAT 5040 

ATTTTACTAC TGA6TCAAGT TTTCTAGTTC TGTGTAATTG TTTA6TTTAA TGACGTA6TT 5100 

CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT TGTGTTACCT AAGTCATTAA 5160 

CTTTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAG AAATACCTTC ATTTTGAAAG 5220 

AAGTTTTTAT GAGAATAACA CCTTACCAAA CATTGTTCAA ATGGTTTTTA TCCAAGGAAT 5280 

TGCAAAAATA AATATAAATA TT6CCATTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 5340 



Seq ID NO: 184 Protein sequence: 
Protein Accession #: EOS sequence 

1 11 21 31 41 51 

I I I I I I

MRIUCRFLAC IQLLCVCRLD WANGYYRQQR KLVBBIGWSY TGALNQKHWG KKYPTOISPK 60 

QSPIHIDEDL TQVNVNLKKL KPQGWDKTSIi ENTFIHKTGK TVEIULINDY RVSGGVSEMV 120 

FKASKITFHH GKCSmSSDGS EHSLEGQKFP LEKQIYCFDA DRFSSFEEAV RGRGKItRAIiS 180 

XLFBVGTSElf LDFKAIIDGV BSVSRFGRQA ALDPFXLUIL hPSSTDKIYl YNGSLTSPPC 240 

TDTVDWrVFK DTVSISBSQL AVFCEVLTMQ QSGYVMUIDY LONNPREQQY KFSRQVFSSY 300 

TGKEBIHEAV CSSEPBNVQA DPENYTSLLV TWERPRWYD TMIBKPAVLY QQLDGEDQTK 360 

HEFLTD6YQD L6AILNNLLP KMSYVLQIVA ICTMGLYGKY SDQLIVDMPT DKPELDLFPB 420 

LIGTSEZIKB ESE6KDIEBG AZVHPGRDSA T13QIRXXSPQ ISTTTBYNRZ GTKVNEAKIN 480 

RSPTKGSEFS GKGDVPMTSL NSTSQPVTKL ATEKDISLTS QTVTBX«PPBT VBGTSA8LHD 540 

GSKTVLRSPH KMLSGTABSL NTVSITEYEB ESLLTSPKli) TGAEDSSGSS PATSAIPPIS 600 

EKISQGYIFS SEm>GTITYD VLXPBSARKA SEDSTSSGSE ESLKDPSHEG HVWFPSSTDI 660 

TAQPDVGSGR BSFLQTNYTE IRVDESERTT KSFSAGPVMS QGPSVTDLSI PH YSTFA YFP 720 

TBVTPBAFTP SSRQQDLVST VNWYSOTTQ PVYHSASMSS HESRIGLABQ IiSSEKKAVZP 780 

LVXV8ALTFI CLWLVGZLZ YWRKCFQTAH PYLEDSTSPR VZSTPPTPZF PZS33DVGAZP 840 

IKEPPKHVAD LHASSGFTEE FEBVQSCTVD LGITADSSNH PDNKHKHRYZ KXVAYDHSRV 900 

KLAQIAEKDG KLTDYINANY VDGYNRPKAY lAAQGPLKST AEDFWRMIWB HNVEVIVMIT 960 

KLVEK6RSXC DQYWPADGSB EYGHFLVTQK SVQVLAYYTV 8NFTZ«H29TKX XKGSQK6RPS 1020 

GRWTQYBYT QHPDM6VPEY 8LPVLTFVRK AAYAKRHAVG FWVBCSAGV GRTGTYZVIO 1080 

SMLQQIQHEG TVNIPGFLKH IRSQRNYLVQ TBEQYVPIHD TLVEAXLSKE TEVLDSHIHA 1140 

YVMAIiLXPGP AGKTKLEKQF QLI>SQS£7IQQ SDYSAALKQC KREKNRTSSX XPVERSSVGX 1200 

SSLS6BGTDY XKASYXMGYY QSNEFXXTQH PLLRTXKDFVJ RMXHDEHAQL WMIFDGQNM 1260 

ABD5FVYHPN KDEPXHCBSP RVTLMAEEHK CIiSNEEKLXX QDFXLEATQD DYVZjEVREFQ 1320 

CPIWFNPDSP XSKTFSLISV IKEBAANRDG PMXVHDEHGG VTAQTFCALT TIMHQLEKEN 1380 

SVDVYQVAKM XNLMRP6VFA DXBQYQFLYK VIIjSLVSTRQ ESaPSTSLDS MGAALPDGNI 1440 
AESLESLV 
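
The coding-sequence coordinates given with each DNA entry can be checked against the length of the paired protein entry. The following is a minimal sketch in Python, assuming the coordinates are 1-based and inclusive and that the range includes the terminal stop codon; the function names are illustrative and are not part of the listing.

```python
def cds_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based, inclusive coordinate range."""
    if start < 1 or end < start:
        raise ValueError("coordinates must satisfy 1 <= start <= end")
    return end - start + 1


def expected_residues(start: int, end: int, includes_stop: bool = True) -> int:
    """Protein length implied by a coding range.

    Assumes the range is an uninterrupted reading frame; when
    includes_stop is True the final codon is treated as a stop codon.
    """
    n = cds_length(start, end)
    if n % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    codons = n // 3
    return codons - 1 if includes_stop else codons


def extract_cds(sequence: str, start: int, end: int) -> str:
    """Slice the coding region out of a sequence (1-based, inclusive)."""
    return sequence[start - 1:end]


if __name__ == "__main__":
    # Seq ID NO: 183 lists the coding sequence as 148-4494.
    print(cds_length(148, 4494))         # 4347 nucleotides
    print(expected_residues(148, 4494))  # 1448 residues
    # Toy sequence (hypothetical, not taken from the listing).
    print(extract_cds("CCATGAAATAAGG", 3, 11))  # ATGAAATAA
```

Under these assumptions the range 148-4494 implies 1448 residues, which agrees with the final position count of the protein entry for Seq ID NO: 184 (1440 plus the eight residues on the last row).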



Seq ID NO: 185 DNA sequence 

Nucleic Acid Accession #: EOS sequence 

Coding sequence: 501-4514

1 11 21 31 41 51 
CACACATACM CAOGCAOSAT CTCACTTOGA TCTATACACT GGAGGATTAA AACAAACAAA 60 
CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC T6AGAAGCAG AGGAGCXX3CA 120 
CGGCGAGGGG CCGCAGACCG TCTGGAAATG CXSAATCCTAA AGCGrrTTCCT OGCTTGCATT 180 
CAGCICCTCT GTCTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 
CTTCTTGAAG AGATTGGCTG GTCCtATACA GGAOCACXGA ATCAAAAAAT TOGGGAAAGA 300 
AATATCCAAC ATCTAATAGC CCAAAACAAT CTCCTATCAA TATTGATGAA GATCTTACAC 360 
AAGTAAATGT GAATCTTAAG AAACTTAAAT TTCAGGGTTQ GGATAAAACA TCATTGGAAA 420 
ACACATTCAT TCATAACACT GGGAAAACAG TGGAAATTAA TCTCA CTAAT GACTACOGTG 480 
TCAGOGGAGG AGTTTCAGAA ATGGTCTTTA AAGCAACCAA GATAACTTTT CACTGG GGAA 540 
AATGCAATAT GTCATCTGAT GGATCAGAGC ATAGTTTAGA A6GAC AAAAA TTT0CACTT6 600 
AGATGCAAAT CTACTGCTTT GATGCGGACC GATTTTCAAG TlTrGAGGAA GCAGTCAAAG 660 
GAAAAGGGAA GTTAAGAGCT TTATCCATTT TGTTTGAGGT TGGGACAGAA GAAAATTTGG 720 
ATTTCAAAGC GATTATTGAT GGAGTOSAAA GTGTTAGTOO TTTTGGGAAG CAGGCTGCTT 780 
TAGATCCATT CATACTGTTC AACCTTCTGC CAAACTCAAC TGAC AAGTAT TACATTTACA 840 
ATGGCTCATT GACATCTCCT CCCTGCACAQ ACACMTTSA CTGGATTGTT TTTAAAGATA 900 
CAGTTAGCAT CTCTGAAAGC CAGTTGGCTG TTTTTTGTGA AG'WITACA ATGCAACAAT 960 
CXGGTTATGT CATGCTGATG GACTACTTAC AAAACAATTT TCX5AGAGCAA CAGTACAAGT 1020 
TCTCTAGACA Uy ' iVmiCC TCATACACTG GAAAGGAAGA GATTCATGAA GCAGTTTGTA 1080 
GTTCAGAACX: AGAAAATGTT CAGGCTGACC CAGAGAATTA TACC AGCCTT CTTGTTACAT 1140 
GGGAAAGACC TOGAGrOGTT TATGATACCA TGATrGAGAA GTTTGCAGTT TTGTA CCAg C 1200 
AGrroGATGG AGAG6ACCAA ACCAAGCATG AATTTTTGAC AGATGGCTAT CAA6ACTTGG 1260 
GTOCTATTCT CAATAATTTG CTACCC3UITA. TGAGTTATGT TCT^^ 1320 
GCACTAATGG CTTATATGGA AAATACRG06 ACCAACTGAT TGTOSACATG CCTACTGATA 1380 
ATCCTGAACT TGATCTTTTC CCTGAATTAA TTGGAACTGA AGAAATAATC AAGGAGGAGG 1440 
AAGAGGGAftA AGACATTGAA GAAGGCX3CTA TTGTGAATCC TGGTAGAGAC A6TGCXACAA 1500 
ACCAAATCAG GAAAAAGGAA CCCCA6ATTT CTACCACAAC ACACTACAAT CGCATAGGGA 1560 
CGAAATACAA TGAAGCCAAG ACTAACCGAT CCCCAACAAG AGGAAGTGAA TTCTCTGGAA 1620 
AGGGTCATC3T TCCSAATACA TCTTTAAATT CCACTTCCCA ACCAOTCACT AAATTAGCCA 1680 
CAGAAAAAGA TATTTCCrTO ACTTCTCAGA CTGTGACTGA ACTGCCACCT CACACTGTGG 1740 
AAGGTACTTC AGCCTCTTTA AATGATG6CT CTAAAACTGT TCTTAGATCT CCACATATGA 1800 
ACTTGTCGGG GACTOCAGAA TCCTTAAATA CAGTTTCTAT AACAGAATAT GAGGAGGAGA 1860 
GTTTATTOAC CAGTTTCAAG CTTGATACTG GAQCTGAAGA TTCTTCAGGC TCCAGTCCCG 1920 
CAACTTCTGC TATCCCATTC ATCTCTGAGA ACATATCCCA AGGGTATATA TTTTCCTCCG 1980 
AAAACCCAGA 6ACAATAACA TATGATGTCC TTATACCAGA ATCTGCTAGA AATGCTTCCG 2040 
AAGATTCAAC rrCATCAGGT TCAGAAGAAT CACTAAAGGA TCCTTCTATG GAGGGAAATG 2100 
TGTGGTTTCC TAGCTCTACA GACATAACAG CACAGCCCGA TGTTGGATCA GGCAGAGA6A 2160 
GCTTTCTCCA GACTAATTAC ACTGAOATAC OTGTTGATGA ATCTGAGAAG ACAACCAAGT 2220 
CCTTTTCTGC AGGCaaGTG ATGTCACAGG GTOCCTCAGT TACAGATCTG GAAATGCCAC 2280 
ATTATTCTAC CTTTGCCTAC TTCCCAACTG AGGTAACACC TCATGCTTrT ACCCCATCCT 2340 
CCAGACAACA GGATTTGGTC TCCACGGTCA ACGTGGTATA CTCGCAGACA ACCCAACOOQ 2400 
TATACAAT6A GGCCAGTAAT AGTAGCCATG AGTCTCGTAT TGGTCTAGCT GAGGGGTTGG 2460 
AATCOtafiAA GAftaOCAGTT ATACCOCTTQ TGATCGTGTC AGOCCTGACI TTTAT CXGTC 2520 
TAGTGCTTCT TGTGGOTATT CTCATCTACT GGAGOAAATO CTTCCAQACT GCACACTTTT 2580 
ACTTAGAGGA CAGTACATCC CCTAGAGTTA TATCCACACC TCCAACACCT ATCTTTCCAA 2640 
TTTCAGATGA TGTOGGAGCA ATTCCAATAA AGCRCTTrCC AAAG CATGT T GCAGATTTAC 2700 
ATGCAAGTAG TGGGTTTACT GAAGAATTTG AGACACTGAA AGAQTTTTAC CAGGAAGTGC 2760 
AGAGCTGTAC TGTTGRCTTA GGTATTACAO CAGACAOCTC CAACCACOCA GACAACAAGC 2820 
ACAAGAATCG ATACATAAAT ATCX5TTGCCT ATGATCATAC CA6GGTTAAG CTAGCACAGC 2880 
TTGCTCAAAA GGATGGCAAA CTGACTGATT ATATCAATGC CAATTATGTT GATG6CTACA 2940 
ACAGACCAAA AGCTTATATT GCTOCCCAAG GCCCACTGAA ATCCACAGCT GAAGATTTCP 3000 
GGAGAATGAT ATGGGAACAT AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA 3060 
AAOGAAGGAG AAAATGIGAT CAGTACTGGC CTGCGGATGQ GAOTGAOGAG TACS3GGAACT 3120 
TTCTG6TCAC TCAOAAGAGT GTGCAAGT6C TTGCCTATTA TACTGTGAQQ AATTTTACTC 3180 
TAAGAAACAC AAAAATAAAA AAGGGCTCCC AGAAAGGAA6 ACOC3U3TGGA CGTGTGGTCA 3240 
CACAGTATCA CTACACGCAG TGGCCTGACA TGGGAGTACC AGAGTACTCC CTGCCAGTGC 3300 
TGACCTTTGT GAGAAAGGCA GCCTATGCCA AGCGCCATGC AflTOQGGCCT GTTGT06TCC 3360 
ACTGCAGTGC TGGPyGrrrGOA AGAACAGGCA CATAIATTGT GCXAGACAGT ATGTTG CAGC 3420 
AGATTCAACA CGAAGGAACT GTCAACATAT TTG6CTTCTT AAAACACATC CGTTCAC AAA 3480 
6AAATTATTT 6GTACAAACT GAGGAGCS^T ATGTCTTCAT TCATGATACA CTGGTTGAGG 3540 
CCATACTTAG TAAAGAAACT GAGGTGCTGG ACAGTCAXAT TCATGCCTAT GTTAATGCAC 3600 
TCCTCATTCC TGGACCAGCA GGCAAAACAA AGCTAGA6AA ACAATTCCAG CTCCTGAGCC 3660 
A6TCAAATAT ACA6CAGAGT GACTATTCTG C3VGCCCTAAA GCAATGCAAC AGGGAAAAGA 3720 
ATC6AACTTC TTCTATCATC CCTGTGGAAA GATCAAGOOT TGGCATTTCA TCCCTGAGTG 3780 
GAGAAGGCAC AGACTACATC AATGCCTCCT ATATCATGGG CTATTACCAG AQCAATGAAT 3840 
TCATCATTAC CCA6CACCCT CTCCTTCATA CCATCAAGGA TTTCTGGAQG ATGATATGGG 3900 
ACCATAATGC CCAACTGGTG GTTATGATTC CTGATGGCCA AAACATGGCA GAAGATGAAT 3960 
TTOTTTACTG GCCAAATAAA GATQA6CCTA TAAATTGTGA GAGCTTTAAG GT CACT CTTA 4020 
TGGCTGAAGA ACACAAATGT CTATCTAATG AGGAAAAACT TATA ATTCAG GACTTTATCT 4080 
TAGAAGCTAC ACRGGATGAT TATGTACTTG AAGTGAGGCA CTTTCA0T6T CCTAAATGGC 4140 
CAAATCCaGA TAGOCXCATT AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG 4200 
CTGCCAATAG GOATGGGCCT ATGATTGTTC ATGAT6AGCA TGGAGGAGIQ ACGGCAGGAA 4260 
CTTTCTGTGC TCTGACAACC CTTATOCACX: AACXAGAAAA AGAAAATTOC GTGGATGTTT 4320 
ACCAG6TAGC CAA6ATGATC AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATTGAGCM3T 4380 
ATCAGTTTCT CTACAAAGTG ATCCTCAGCC TTGTGAGCAC AAG6C3W36AA GAGAATCCAT 4440 
CCACCrCTCT GGACAGTAAT GGTCCAGCAT TGCCTGATGG AAATATAGCT GAGAGCTTAG 4500 
AGTCTTTAQT TTAACACAGA AAGGGGTGGG GGGACTCACA TCTGAGCATT GTTTTCCTCT 4560 
TCCTAAAATT AGGCAGGAAA ATCAGTCTAC TTCTGTTATC TOTTGATTTC O CATCACC TO 4620 
ACAGTAACTT TCATGACATA GGATTCTGCC GCCAAATTTA TATCaiTAAC AATGTGTGCC 4680 
TTTTTCCAAG ACTTGIAATT TACTTATTAT GTrTGAACTA AAATGATTGA ATTTTACAGT 4740 
ATTTCTAAGA ATGGAATTCT GGTATTTTTT TCTGTATTQA TTTTAACAGA AA ATTTCAA T 4800 
TTATAGAGGT TAGGAATTCC AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAG 4860 
CTGTATTTGT AGCAATTATC AGGTTTGCTA GAAATATAAC TTTTAATACA OTAfiCCTGTA 4920 
AATAAAACAC TCTTOCATAl GATATTCAAC ATTTTACAAC TQCAOIATTC ACCTAAAGTA 4980 
GAAATAATCT GTTACrrATT OTAAAIACPG CCCTAOTGTC TCCATGGACC AAATTTATAT 5040 
TTATAATTCT AGATTTTTAT ATTTTACTAC TGAGTCAAGT TtTTOGTTC TGTGTAATTC 5100 
TTTAGTTTAA TGAOOTAflTT CATTAGCTOa TCTTACTCTA CCAGTTTTCT GACATTGTAT 5160 




TGTGTTACXrr AAGTCATTAA CTTTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAG 5220 

AAATACCTTC ATTTTGAAAG AAC5TTTTTAT GAGAATAACA CCTTACCAAA CATTGTTC3U 5280 

A7GGTTTTTA TCCAAGGAAT TGCAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAA 

Seq ID NO: 186 Protein sequence: 
Protein Accession #: EOS sequence 

1 11 21 31 41 51 

I I I I I I

MVFKASKITF HHGKCNMSSD GSSISZiEGQR FPLa4QZYCP DADRPSSFEE AVKGKGKLRA 60 

LSZZ.FEVGTB E19U}F1CAXID 6VBSVSRFGK QAAUSPFZLI* NIaLFKSll>KY YIYHGSLTSP 120 

PCTOTVDWIV PKDTVSISES QLAVFCBVLT MQQSGYVMIM DYLQHKFKBQ QYKPSRQVFS 180 

SYTGKEEIHB AVCSSBPENV QADPENYTSL LVTWBRPRW YDTMIBKFAV LYQQLDCEOQ 240 

TKHEFLTDGY QDLGAI LNZni LPNMSYVliQI VAICTNGLYG KYSDQLIVDM PTDHPELDZiF 300 

PBLIGTEEZI KEBEE6KDIE EGAIVNPGRD SAIKQIRKKE PQISmHYM RIGTKXKEAK 360 

niRSPTRGSE FSGK(a>VPHT SUISTSQFVT KCiATERDISL TSQTVTELPP BTVEGTSASXi 420 

NDGSKTVLRS PEMNLSGTAE SUTTVSITEY EEESLLTSPK LDTGAEDSSG SSPATSAIPF 480 

ISQnSQGYI FSSENPETIT YDVLIPBSAR NASEDSTSS6 SEESUCDPSM DGNVHFPSST 540 

DITAQPDVGS GRESFLQTNY TEIRVDBSEK TTKSFSAGPV KSQGPSVTDL EMPBYSTFAY 600 

PPTEVTPHAF TPSSRQQDLV STVMWYSQT TQPVYNBASN SSBESRIGLA BGLBSEKRAV 660 

ZPLVIVSALT FICLWLVGI LIYNRKCFQT ARFYLEDSTS PRVZSTPPTP IFPZSDDVGA 720 

IPIKHFPKHV ADUIASS6FT EEPETLKEFY QEVQSCTVDL GITADSSNHP DNKHKNRYIN 780 

ZVAYDHSRVR LAQLAEKZX^K LTDYINANYV D6YNRPKAYI AAQGPLKSTA EDFWRMIHEH 840 

NVEVIVHITN LVEKgRRKCD QYHPADGSEB YGNFLVTQKS VQVLAYYTVR KFTLRNTKZK 900 

KGSQKGRPSG RWTQYBYTQ NPDKGVPSYS XfVLTFVRXA AYAKRBAVGP VWHCSAGVG 960 

RTGTYIVXiDS MLQQZOHBGT VNZFGFLXHZ RSQRKYLVQT EEQYVFZHDT LVEAILSKET 1020 

EVIiDSHIBAY VNALLZPGPA GKTKLEKQFQ LLSQSZTIQOS DYSAALKQOf REKNRTSSII 1080 

PVERSRVGIS SL8GBGTDYI MASYIK6YYQ SNEFIITQHP IiLHTIKDFWR MXWDHNAQLV 1140 

VMIPDGQKKA EDBFVYHPNK DEPINCBSFX VTU4AEEHKC LSNEEKLIIQ DFILEATQDD 1200 

YVLBVRKFQC PXHPNFDSPZ SKTFEIiISVI KEEAANRDGP NZVHDBR6GV TAGTFCALTT 1260 

IMHQLEKENS VDVYQVAKMZ MLMRPGVFAD ZSQYQFLYKV ZZiSLVSTRQB BIPSTSU3SK 1320 
GAALPD6NIA ESLBSLV 

Seq ID NO: 187 DNA sequence 

Nucleic Acid Accession #: EOS sequence 

Coding sequence: 148-4632

1 11 21 31 41 51 

I I I I I I

CACACATAG8 CACGCAOQAT CTCACTTGOA TCTATAGACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTC6 CTCCCCCTOC CTCTCCACTC TGAGAA6CA0 AGGAGCCGCA 120 

CGGOSAGGGO CCGCAGACCG TCTGGAAATG OGAATCCTAA AAOGTTTCCT OGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACA6ACA ACAGAGAAAA 240 

CTT6TTGAAG AGATTGGCT6 GTCCTATACA G6A6CACTGA ATCAAAAAAA TTGQ6GAAAG 300 

AAATA7CCAA CATGTAATA6 CCCAAAACAA TCTCCTATCA ATATTOATaA AOATCTTACA 360 

CAAGTAAATG T6AATCTTAA GAAACTTAAA TTTCA66GTT GGQATAAAAC ATCATT6GAA 420 

AACACATTCA TTCATAACAC TOGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCR AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TX3GATCA6AG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC OGATTTTCAA GTTTTQAGQA AGCA6TCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGG6ACA6A AGAAAATTT6 720 

GATTTCAAAG OGATTATTGA TGGAOTCGAA AGTGTTAGTC G TTTT G GGAA GCAG6CTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGOCrCAT TGACATCTCC TCCCTGCACA GACAC3W3TTG ACTGGATTOT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT G ' riTmtj ' i ' G AAGTTCTTAC AATGCAAC3A 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTOTTTTC CTCATACACT GGAAAGGAAO AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATCJT TCAQOCIGAC CCAGAGAATT ATACC3WSCCT TCTTGTTACA 1140 

TQGGAAA6AC CTOQAGTCGT TTATGATACC ATGATTGAGA AGTTT6CAGT TTTGTACCAG 1200 

CAGTTGGATG GAQA6GACCA AACCAAGCAT GAATTTTTGA CAGAT66CTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACOCAAT ATGAGTTATQ TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG 6CTTATAT6G AAAATACA6C GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTQA AGAA6G06CT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TOGCATAGOG 1560 

AOGAAATACA AIGAAGCCAA GACTAACQGA TCCXCAACAA GAGGAAGTQA ATTCTCTGGA 1620 

AAGGGTGATQ TTOCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAO ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATG6C TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAG6AG 1860 

AGTTTATTGA C3CAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATOOCATT CaTCTCTGAG AACATATOOC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATAT6ATGTC CTTATACCAO AATCIGCTAO AAATGCTTCC 2040 

6AAGATTCAA CITCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGRGG6AAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCOG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA 0GT6TTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTOCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATrTGGT CTCCAOGGTC AAOGTGGTAT ACTCGCA6AC AACCCAAC06 2400 

GTATACAATG AGGCCAGTAA TA6TAGCCAT GAGTCTOGTA TTGGTCTAGC TGAGGGGTTG 2460 

GAATCOGAQA AGAAGGCAOT lATACOCCTT GTGATOGTGT CA60CCT6AC TTTTATCTGT 2520 

CTAGTGGTTC TTQTOQGTAT TCTC AT C I AC TG6A6GAAAT GCTTCCAGAC TGCACACTTT 2580 

TACTIAGAGG ACAGTACATC OCCTAGAGTT ATATOCACAC CTCCAACAC3C TATCTTTCCA 2640 

ATTTCAGATG ATGTOGGAGC AATTOCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 2700 

CATGCAAGTA GTG6GTTTAC T6AAGAATTT 6AGACACTGA AAGAGTTTTA CCAGGAAGTG 2760 

CA6A6CTGTA CTGTTGACIT AGGTATTACA GCAGACAGCT CCAAOCACXX: AGACAACAAG 2820 
CACAAGAATC GftTACATAAA TATOGT T GCC TATGATCUTA 6CAGGGTTAA 6CTAGCACAG 2880 

CTTCCTGAAA AGGATGGCAA ACTGACT6AT TATATCAAT6 OCAATTATGT T6ATCGCTAC 2940 

AACAGACCAA AAGCTTATAT TGCTGCCCAA G6CCCACTGA AATCCACAGC TGAAGATTTC 3000 

TGGAGAATGA TATGGGAACA TAATGTGGRA GTTATTGTCA TGATAACAAA CCTCGTGGAG 3060 

AAAGGAAG6A GAAAAT6TGA TCA6TACTG6 CCTGCC6ATX3 GQAST6AGGA GTA0S6GAAC 3120 

TTTCIGGTCA CTCAGAAGAG T6T6CAAGTG CTTGCCTATT ATACTGTOAG GAATTTTACT 3180 

CTAAGAAACA OVAAAATAAA AAAG66CTCC CA6AAA6GAA GACGCAGT66 AO G TGTGGTC 3240 

AC3W»GTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC C3W3A(3TACTC CCTGCCAGTG 3300 

CTGACCTTTG TGAGAAAGGC AGCCTATGCC AAGCGCCATG a«3TGGGGCC TGTTGTCGTC 3360 

CACTGCAGTG CTGGAGTTGG AA6AACA6GC ACATATATTG TGCTAGACAG TATGTTGCA6 3420 

CAGATTCAAC A0GAAG6AAC TGTCIIACATA TTTG6CTTCT TAAAACACAT CGOTTCACAA 3480 

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGA6 3540 

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 3600 

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAA7TCCA GGGTCTCACT 3660 

CTGTCACCCA GOCTGGAGTG CAGA6GCACA ATCTOGGCTC ACTGCAACXT TCCTCTCCCT 3720 

GGCTTAACTG ATCCTCCTAC CTCAGCCTCC OGAGTGQCTG GGACTATACT 0CTGA6CCA0 3780 

TCAAATATAC AGCAGAGTGA CTATTCTGCA GCCCTAAAGC AATGCAACaO GGAAAAGAAT 3840 

OOAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGOTTS GCATTTCATC CCTGAGTGGA 3900 

OAAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGGCT ATTACCAGAG CAATGAATTC 3960 

ATCATTACCC AGCACCCTCT CCTTCATACC ATCAAGGATT TCTGGAGGAT GATATGGGAC 4020 

CATAATGCCX: AACTGGTGGT TATGATTCCT GATGGCCAAA ACATGGCACA AGATGAATTT 4080 

GTTTACXX3GC C3UUITAAAGA TGAGCCTATA AATTGTGAGA GCTTTAAGST CACTCTTATG 4140 

GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 4200 

GAAGCTACAC AGGATGATTA T6TACTTGAA GTSAGGCACT TTCAGTGTCC TAAATGGCCA 4260 

AATCCAGATA GCCCCATTAG TAAAACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 4320 

GCCAATAGGG ATGGGCCTAT GATTGTTCAT GATGAGCATG GAGGAGTGAC GGCAGGAACT 4380 

TTCTGT6CTC TGACAACCCT TATGCACCAA CTAGAAAAAG AAAATTCOQT QGATGTTTAC 4440 

CAGGTAOCCA A6ATGATCAA TCTGATGAGG CCAGGAGTCT T7GCTGACAT TGAGCAGTAT 4500 

CAGTTTCTCT ACAAAOTGAT CCTCAGCCTT 6TGGGCACAA G6CAGGAAGA GAATCCATCC 4560 

ACCTCTCTGG ACAGTAATGG TGCAGCATTG CCTGATGGAA ATATAGCTGA GAGCTTAGAG 4620 

TCTTTAGTTT AACACAGAAA GGGGTGGGGG GACTCACATC TGAGCATTGT TTTCCTCTTC 4680 

CTAAAATTAG GCAGQAAAAT CAGTCTAGTT CTGTTATCTG TTGATTTCCC ATCACCTGAC 4740 

AfiTAACTTTC AtGACATAGO ATTCTGCOGC CAAATTTATA TCATTAACAA TGTGTGCCTT 4800 

TTTOCAAGAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATTGAAT TTTACAGTAT 4860 

TTCTAAGAAT GGAATTGTGG TATTTTTTTC TGTATTGATT TTAACAGAAA ATTTCAATTT 4920 

ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT TTAGTGTCAA ATTTTTAGCT 4980 

GTATTTGTAG CAATTATC3«3 GTTTGCTAGA AATATAACTT TTAATACAGT AGCCTGTAAA 5040 

TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACT6 CAGTATTCAC CTAAAOTAGA 5100 

AATAATCTGT TACTTATTOT AAATACTGOC CTAOTGTCTC CATOffllCCAA ATTTATATTT 5160 

ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAA6TTT TCTAGTTCTG TGTAATTGTT 5220 

TAGTTTAATG ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 5280 

T6TTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 5340 

ATAOCTTCAT TTTGAAAGAA GTTTTTATGA GAATAACAOC TTACCAAACA TTGTTCAAAT 5400 

6GTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 5460 
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 188 Protein sequence: 
Protein Accession #: EOS sequence 

1 11 21 31 41 51 

I I I I I I

MRZIiKRFLAC IQUiCVCRLD HANGYYRQQR KLVEEIGWSY TGAUiQKNWG VXmOHSPK 60 

QSPIKIDEDL TQVNVNLKXZi KFQ6HDKTSL BNTFZBKTGX TVEniLTNDY RVSGGVSEMV 120 

FXASXITFHW GKOIMSSDQS EaXSIiBOQKFP LEMQIYCFDA DRF8SFEEAV KSXGKIiRALS 180 

IZiFBVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPHSTDKYYI YNGSLTSPPC 240 

TDTWDHIVFK DTVSISESQL AVPCBVLTMQ QSGYVMLMDY LQNNPREQQY KPSRQVPSSY 300 

TQKEBZHEAV CSSEPENVQA DPENYTSLLV TWERPRWYD TMIEKFAVLY QQIiDGEDQTK 360 

BEFLTDGYQD LGAILIQILLP NMSYVLQIVA ZCTNGLY6KY SDQLZVDMPT DHPELOLFPB 420 

LIGTEBZZKB BBBGKDIEEG AIVNPGSOSA TOQZRKKEPQ ZSTTTHYKRZ GTKYNEARTN 480 

RSPTRGSBPS GKQDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VBGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPPIS 600 

ENISQGYZFS SSHPETZTYD VLZPESARHA SEDSTSSGSE ESLKDPSMB6 NVWFPSSTDZ 660 

TAQPDVGSC^ ESFLQTNYTB ZRVDESBKTT RSFSAGFVKS QGPSVTDZiEM PHY 5TPA YFP 720 

TBVTPHAPTP SSRQQDLVST VNWYSQTTQ PVYUEASMSS HESRZGLAEG LESEKKAVZP 780 

LVIVSALTPI CLWIiVGILI YWRKCPQTAH PYLEDSTSPR VISTPPTPIP PZSDDVGAZP 840 

IKHFPiOIVAD LHASSGFTEE PETLKEFYQB VQSCTVDLGI TADSSNHPDN KHKNRYINIV 900 

AYDHSRVKLA QIAEKDGKLT DYZMANYVDG YNRPKAYZAA QGPLKSTAED PMBMIWBBKV 960 

EVIVMITNLV EXGRRKCDQY WPADGSSEYG NFLVTQKSVQ VLAYYTVRHF TLSNTKZKKG 1020 

SQKGRPSGRV VTQYHYTQWP DMGVPBYSIiP VLTFVRKAAY AKRHAVGPW VHCSA6V6RT 1080 

GTYIVUJSMb QQIQHEOTVN IPGFLKHIRS QRNYIiVQTEE QYVFIHDTLV EAILSKBTBV 1140 

LDSHIHAYVN ALLIPGPAGK TKLEKQFQGL TLSPRLECRG TISAHQJIiPIi PGLTDPPTSA 1200 

SRVAGTILLS QSKZQQSDYS AALKQQIREK NRT8SZZFVB RSRV6ZSSLS GEGT DYZMA8 1260 

YXKGYYQSNB FZZTQBPLLB TZKDFNRMIW DHNAQLWNZ FDQQKMAa)B FVYHFNKDEP 1320 

ZKCSSFKVTL MAEEHKdiSN EEiCLIIQDFZ XiBATQDDYVL EVRHFQCPKN FNPDSPZ8KT 1380 

FBLZSVZXEB AAmOlGFKZV EDEBGGVTAO TFCAIiTTLKB QLEKaVSVDV YQVAKMXHIM 1440 
RPGVFADIEQ YQFLYKVZI.S LVGTRQEBHP STSLDSHGAA LPOOTZAESL E5LV 



Seq ID NO: 189 DNA sequence 
Nucleic Acid Accession #: NM_003620 
Coding sequence: 304..831 

1 11 21 31 41 51 

I I I I I I 

C0G6TT06CA AAGAAGCT^ CTTC31SAG0G G6AAACTTTC TTCTTTTAGG AGGCGGTTAG 60 
00CT9TTCCA GGAACCCAGG AOAACTGCTG GCCAGATTAA TTAGACATTG CTATGGGAGA 120 
OGTGTAAACA CACTACTTAT CATTGATGCA TATATAAAAC CATTTTATTT TCGCTATTAT 180 



TTCAGA6GAA GOGOCTCTGA 
GTTTGGAGAA A6CACAGTTG 
AGCATGCAGC GGAGACTGGT 
GfTGCCCTCCT GCC3GGGGCTC 
GAACATCAGC TCCTCCATGA 
CTTCAOCATC TGTlTOGCAGft 
CCTAACTCOl AGCCCTCTCC 
GA6GGCAGAT ACCTAACTCA 
AAGACACCTG GGAAGAAAAA 
AAA0G60CSAA CTOGCTCTGC 
GAOCACCTGT CTGACAOCTC 
CT6GCC0STA GCCTCA6CX36 
GCTTGGACAA ACCTAGAATT 
CAGAGAATAA CTCAGAATAT 
TGTCCTCCAG CACCATAQAQ 
CATC3VATCCT TTAOCACTCT 
ATCTTCATAA TTT6CTGGAG 
TTCTTCAGTX3 TTTTTCATTT 
GATATTATCT ACAAACACTG 
ACTTTTTATT TAATTAAATG 
TAAATTATGT TTTAAACACA 
CCAGCTCATA CAAAATAAAT 
G6TTTTTCTC ATGTATCTTT 
CaTTAOGAAA AATAAAACTT 






vntimcsv 

GAGTAQCCGQ 
TCAGCAGTGG 
G(?rGGAGGGT 
CAAOGGGAAG 
AATGCACACA 
CAACACAAAG 
GGAAACTAAC 
GAAAGGCAAG 
CTGGTTAGAC 
CACAA0GT06 
GGTGCTCTCA 



A6Q06CTAQA 
AOCAAATAAT 
AAGTGTATTT 

CTTAOGTTCT 
CAGAACAGCA 
TATTTAATTA 
TGCCTTAAAT 
GGTTTCTGAA 
TTGTTCATTG 
CACATTTAAA 



TTTTCOCTTT 
TTGCTAAATA 
AG0GT0GG6G 
CTCAGCCGCX: 
TCCATCGAAG 
6CTGAAATCA 
AACCACCCOG 
AAG6TGGAGA 
CCOGGGAAAC 
TCTGGACalGA 
CTG6AGCT0G 
GCTOGGTTTT 
ATGTATCTCT 
AAA6CAGTAC 
GCCCATTOCT 
TTCATATTCA 
CTTCOCCTTA 
TTCACTTCAA 
TCATGTCATA 
AATCTCAAAT 
TTGTTTAATT 
AATGTTTAAG 
GCAAGAT6AA 



TTGCrCTTTC 
AGTCCGGA6C 
TGTTCCTGCT 
6CCTCAAAAG 
ATTTA0G6CG 
GAiKTACCTC 
TOOGATTTGG 
CGTACAAAGA 
GCAAGGAGCA 
CTGG6AGTG6 
ATTCACGGTA 
GGA6CCTCCC 
ATCGATTGTG 
CCCCCTACCA 
CTTTCrcCAC 
A6CTTCA6AA 
CTCtCACACC 
GGGABAATAT 
AAOQATTCTG 
TTATTTTAAT 
AAATTTAACT 
TATTAACTTA 
ATAATTTTTC 



TGGCTGT6TG 
G0GAGC6GAG 
GAGCTACGC3G 
AGCTGTGTCT 
AOGATTCTTC 
GGAGGTGTOC 
GTCTGATGAT 
6CAGC06CTC 
GGAAAAGAAA 
GCTAGAAG66 
ACAGCSCTTCT 
TTCTGCCTT6 
TAGCAATT6A 
CACACACCCC 
CGTCACCCAA 
GCTAG7GACC 
TGGGCAAACr 
AGAAGCATTT 
AGCCATTCAC 
GTAAAGAACT 
CTGGTTTCTA 
CAAGGATATA 
TAGG6TAATG 



Seq ID NO: 190 Protein sequence: 
Protein Accession #: NP_002811 



1 11 21 31 41 51 

I I I I I I 

NQRSLVQQHS VAVFLIiSyAV PSOSRSVEGL SRRLXRAVSB BQLLBDKGKS XQDLRRRFFL 
RHLIAEIHTA EXRATSEVSF KSKPSPNTKff RPVRF6SDDS GRYLTOBniK VETYKEOPLR 
TPGKKXKGKP GKRKEQBKKK RRTRSAHLDS GVTGSGLGGD HLSDTSTTSL ELDSR 

Seq ID NO: 191 DNA sequence 
Nucleic Acid Accession #: XM_059328 
Coding sequence: 52..1023 



GGGCTGTCOQ 
CCTOGCATCC 
GGTAT0GTG6 
GCGGCCACGG 
GCCAACCTGT 
GGC00G6AAG 
GT6GATTTGC 
CTG6GCAGGG 
TGCCAGGTGT 
6AG060G6TG 
GTG6AGC36Q6 
GAO GCCT TOG 
GCCCTGGCGC 
G06CACCCCG 
TTCTCTTGCT 
6CCCAGCTT6 
CCAGGGGAGG 
TGACCXXXTTA 
TTAGTCCTGG 
GG ACACTOCX: 
AGCCTTCTTS 
TGGTGCOCCT 
CTATATTAAT 



11 
I 

6CCCACTCCC 
6CCT6GTGGT 
AGGCCTTTCT 
AGAGCGCGGC 
CCGAGG6CGG 
GCTTCTTCCT 
CTCASGTGOG 
CCCCCAOGCA 
TCGCCGAGGC 
TGGGTGGCTG 
A06CC0QGQC 
TGG6CCTGA6 
GGGTCCTGGA 
GCTACCCCA6 
CTTG6GAG06 
CCO^GGATGG 
A6GTCCCCTG 
CAGACAACCA 
CCCAGCCCAG 
ACCTCTGGGC 
GCTGCAGGCA 
CCATGTTGCA 
AAAATAAOGT 




GGCOGGGGCT 
GGAGCTGGCC 
CCXX^GTGGGT 
TQGCAAGATO 
G6A0GA6CTC 
CGCXX3ACGGG 
GCTGCAGGCC 
CACTTGGCTG 
0600GTGGGC 
CACTTGCGGC 
AGGTACCCTA 
TGTGCCTCCC 
GCT6CATGAG 
06T6CAGCTT 
TGAGCCCACT 
AGCACTAATC 
AGCTGGGACC 
TC3M3GTCCTC 
6GCCTAGCCT 
ATOCAAACAC 
GTGTCTTTC 



31 

I 

GA60QGTGGA 
QACTTTGGTT 
GTGACCAGOG 
CGCAGGCACA 
CCGGCCCX3CC 
GGATTCOGGG 
6AGGCXXAAC 
CACCAGCAC3G 
TATGGGGTGC 
GAGGGCCCOG 
CCCTTCTOCC 
C6GCACATGT 
GOGGGCCACA 
ACCGGOGGCT 
CTGOGOGTCC 
TGOQCCCTOG 
CTGGAACCCT 
CCCTTAGTAC 
TGGAGCAOGA 
ATGCCTCCAA 
GTGGCA6CX36 
CTTCACCACT 



41 

I 

CCCAGGOGGC 
ACT6CC0S06 
TGTCCCTGCT 
GCATCCCCAC 
GT6GCGGCTC 
AG606(3TG6C 
TAA6CT6CTT 
TGCAOGTGCT 
GCTTTACGCG 
CGCQTGCCTT 
G^OSGCCT 
COGCTCACOG 
CCCTGACAGC 
GCGGTGAAGG 
TCAC0606CC 
A08ACCTGGA 
TCCTG6AACC 
CAAGAAAGGG 
TCTGTTGACT 
ATGGCATCTA 
GCTAGGGCOC 
GGGGCAGTSO 



51 
I 

CATGTCC06C 
AOGCGATGAS 
6GTCAACGGT 
GGGCCTCCAC 
ATCGCTGCTC 
GGCG6GAGAC 
00GG6AGCT6 
CCCAGGOGTG 
ACTGCCGCTG 
06CCTGCGCC 
6CG6TGGACA 
CXSTGTCOGGG 
CGAGCTGATG 
CCCCGACGCT 
GACGCTGGGG 
CTCCAASA66 
CTCCCTACTC 
GAGCCAGGAT 
TCCCTGGGTA 
GAGTTTGA6C 
OCAGAOCATT 
aOAaAGATGG 



Seq ID NO: 192 Protein sequence: 
Protein Accession #: XP_059328 



1 11 21 31 41 51 

I I I I I I 

MSRPRMRLW TADDFCYCFR RDB6IVEAFL AGAVTSVSLL VNGAATESAA SLARRBSZPT 
GLEAHLSB6R FVGPAKRGAS SLIiGFEGFFL GKMSFKEAVA AGDVDLPQVR EELEAQLSCP 
RELLGRAPTH ADGHQHVHVL PGVOQfVPAEA LQAYGVRFTO LPLERGVGGC TWLEAPARAP 
ACAVERDARA AVGPPSRHGL RWTDAFVGLS TCX3RHKSAHR VSGALARVLE GTLAGRTLTA 
ELMAHPGY^S VPPTGGCX3B6 PDAFSCSNER LHELRVLTAP TLRAQLAQD6 VQIiCAU>DIiD 
SXRP6BEVPC EPTLBPFLEP SLL 

Seq ID NO: 193 DNA sequence 

Nucleic Acid Accession #: NM_005668.1 

Coding sequence: 126..4439 



240 
300 

360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 



60 
120 
180 
240 
300 



1 11 21 31 41 51 

I I I I I I 

COGGGCAGGT GGCTCATGCT CGGGAGCGTG GTTGAGCGGC TGGCGCGGTT GTCCTGGAGC 60 

AGQGGCGCA6 GAATTCTGAT GTGAAACTAA CAGTCTGT6A 6CCCT6GAAC CTCCGCTCAQ 120 




AGAAGMGAA GQATATGGftC ATAOGAAAAG A6TA1ATCAT OCCC3U3TCCT GGQXATAGAA 180 

GTCTGAGGQA GAGAAGCAGC ACTTCTOGQA CGGACAOMIA CQGT GA AGAT TCCAAGTTCA 240 

GGAGAACTOS ACOGTTGGAA. TGCCAAGAT6 CCTTG6AAAC AGCAGCCX3GA GCXSGAGGGCC 300 

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAGGAG CATCCCAAGG 360 

GAAA6TACCA TCATGGCTTG AGT6CTCTGA AGCCCATCOG GACTACTTCC AAACACCAGC 420 

ACOCAGTOGA CAAT6CT0G6 Cmv n Vir V 6TATGACTTT TTOGTGGCTT TCTTCTCTGG 480 

CC06TGTGGC CCACAAGAAG 6GGGAGCTCT CAATGGAAGA OGTGT GG TCT CTGTCC A AGC 540 

AOGAGTCITC TGAOGTGAAC TGCAGAAGAC TA(3U3AGACT GTG6CAAGAA GAGCTGAATG 600 

AAGTTGOGCC AGACGCTGCT TCCCTGOGAA GGGTTCTrGTG GATCTTCTGC OSCACCAGGC 660 

TCATCCTGTC CATOGTGTGC CTGAT6ATCA CX3CAGCTGGC TGGCTTCAGT GGACCA6CCT 720 

TCKIGGTGAA ACHOCTCrrO 6AGTATACCC A6QCAAGAGA GTCTAAOCTG CAGTACAGCT 780 

TGTTGTTAGT GCTOGGCCT C CTCCTGACXSG AAATOSTGOS GTCTTGGTOG CTT6CACT6A 840 

CTTGGGCATT GAATTACCGA ACOGGTGTCC GCTTGOGGOG GGCCATCCTA ACCATGGCAT 900 

TTAAGAAGAT CCTTAAGTTA AAGAACATTA AAGAGAAATC CCTG6GTGAG CTCATCAACA 960 

TTTGCTCCAA CGATGGGCAG AGAATGTTT6 AGGCAGCA6C CGTTGGCAGC CTGCTG6CT6 1020 

6AG6ACC0GT TGTTGOCATC TTAGGCATGA TTTATAATGT AATTATTCTO GGACCAACAO 1080 

GCTTCCTQGG ATCAGCTGTT TTTATCCTCT TTTAOCCAGC AATGATGTTT GCATCACG6C 1140 

TCACAGCATA TTTCAOGAGA AAATGOGTGG C0GCCACX3GA TGAAOSTGTC CAGAAGATGA 1200 

ATXSAA6TTCT TACTTACATT AAATTTATCA AAATGTATGC CTGGGTCAAA GCATTTTCTC 1260 

AGAGTGTTCA AAAAATCOGC GAGGA6GAGC GTOGGATATT GGAAAAA6CC GGGTACTTCC 1320 

AGGGTATCAC TGTG6GTGT6 QCTGOCATTG TGGTGGT6AT TGCCAGGSI6 GTGAOCTTCT 1380

CTGTTCATAT GACXXTTGGGC TT06ATCTGA CAGCAGCACA G6CTTTGACA 6TGGTGACAG 1440 

TCTTCAATTC CATGACTTTT GCTTTGAAAG TAACAC0C5TT TTCAGTAAAG TCCCTCTCAG 1500 

AAGCCrCAGT GGCTGTIGAC AGATTTAAGA GTTTGTTTCT AATGGAAGAG GTTCACATGA 1560 

TAAAGAACAA ACCAGCCAGT CCTCACATCA A6ATAGAGAT GAAAAATGCC ACCTTGGCAT 1620 

GGGACTGCTC CCACTCCAGT ATCCAGAACT CGCCCAAGCT GACCCCCAAA ATGAAAAAAO 1680 

ACAAGAGGGC TTCCAGGGGC AAGAAAGAGA AGGIXSAGGCA GCTGCAGOGC ACTGAGCATC 1740 

AGGOGGTGCT GGCAGAGCAG AAAGGCCACC TCCTCCTGGA CABTGAOGAS CGGCCCAGTC 1800 

COGAAGAGGA AGAAGGCAAG CACATCCACC TGOGCCACCT GOGCTTACAG AGGACACT6C 1860

ACAGCATCGA TCTGGAGATC CAAGAOGGTA AACTGGTTGG AATCT6CG6C AGTGTGGGAA 1920 

GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCX^WSAT GAOGCTTCTA GAGGGCAGCA 1980 

TT6CAATCA0 TGGAACCTTC GCTTATGTGG CCCAGCAGGC CTGGATCCTC AATGCTACTC 2040 

TGAGAGACAA CATOCrOTTT GGGAA6GAAT ATGATGAAGA AAGATACAAC TCTGTGCTGA 2100 

ACAGCTGCTG CCTGAGGCCT GACCT66CCA TTCTTGCCftG CASCQACCT6 AOGGAGATTG 2160 

GAGAGCGT^ A6CCAACCTG AGOGGTGGGC AGOGGCAGAG GATCAGCCTT GCCCGGGCCT 2220 

TGTATAGTGA CAGGAGCATC TACATCCTGG ACX3ACCCCCT CAGTGCCTTA GATGCCCATG 2280

TGGGCAACCA CATCTTCAAT AGTGCTATCC GGAAACATCT CAAGTCCAAG ACASTTCTGT 2340 

TTGTTAOCCA CCAGTTACAO TACCTG6TT6 ACTGTGATGA A6T6ATCTTC ATGAAAGA6Q 2400 

GCIGTATTAC GGAAAGAGGC ACCCATGAGG AACTGATGAA TTTAAATGGT GACTATGCTA 2460 

CCATTTTTAA TAACCTGTTG CTOGGAGAGA CAC0GCCA6T TGAGATCAAT TCAAAAAAGG 2520 

AAACCAGTGG TTCACAGAAG AAGTCACAAG ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 2580 

AGGAAAAAOC AGTAAAGCCA GAGGAAGGGC AGCTTGTGCA GCTGGAAGAG AAAGGGCAGG 2640 

GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACAT0CAG6C T6CIt36GG6C CCCTTGGCAT 2700 

TCCTGGTTAT TAT6GCCCTT TTCATGCTGA ATCTAGOCAG CACOGCCTTC AGCACCTGOT 2760 

GGTT6A6TTA CTGGATCAAG CAAGGAA6CG GGAACACCAC TGTGACT06A GGGAACGAGA 2820 

CCTCGGTGAG TGACAGCATG AAOGACAATC CTCATATGCA GTACTATGCC AGCATCTACG 2880 

CCCTCTCCAT GGCAGTCATG CTGATCCTQA AAGCCATTOG AGGAGTTGTC TTTGTCAAGG 2940 

GCAOGCTGCG AGCTTCCTCC OOGCTGCATG AOSAGCTTTT CGGAAGGATC CTTOGAAGCC 3000 

CTAT6AAGTT TTTT6ACACQ ACCCCCACAG GGAGGATTCT CAACAGGTTT TCCAAAGACA 3060 

TGGATGAAGT TGAOGTGCGG CTGCCGTTCC AGGCOGAGAT GTTCATCCAG AAOGTTATCC 3120 

TGGTGTTCTT CTGTGTGGGA ATGATCGCAG GAGTCTTCCC GTGGTTCCTT GTGGCAGTGG 3180 

GGCCCCTTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTCGG6AGC 3240 

TGAAGGGTCT G6ACAATATC AGGCAGTCAC CTTTCCTCTC OCACATCACG TCCAOCATAC 3300 

AGGGCCTTGC CACCATCCAC GCCTACAATA AAGGGCAGGA GTTTCIGCAC AGATACCAGG 3360 

A6CTGCTGGA TGACAACCAA GCTCXTTTTTT TTTTGTTTAC 6TGTGCGAT6 0GGTG6CTGG 3420 

CTGTGOGGCT GGACCTCATC AGCATCGCCC TCATC3lCXaU: CACGGGGCTG ATGATOGTTC 3480 

TTATGCACGG GCAGATTCCC CCAGCCTATG CGGGTCTOSC CATCTCTTAT GCTGTOCAGT 3540 

TAAOQGGGCT GTTCCAGTTT A0G6TCAGAC TGGCATCTGA GACA6AA6CT OGATTCACCT 3600 

CGGTGGAGAG GATCAATCAC TACATTAAGA CTCTGTCCTT G6AAGCACCT GCCA6AATTA 3660 

AGAACAAGGC TCCCTCCCCT GACTGGCXICC AGGAGGGAGA G6TGACCTTT GAGAAOGCAG 3720 

AGATGAGGTA CCGAGAAAAC CTCCCTCTTG TCCTAAAGAA AGTATCCTTC ACGATCAAAC 3780

CTAAAGAGAA GATTGGCATT GTGGGG06GA CAGGATCAG6 GAAGTCCTOG CTGGGGATGG 3840 

OOCTCTTCGG TCTGGTGGAO TTATCTGQAG OCTOCATCAA GATTGATGGA GTGA6AATCA 3900

GTGAtATTGG CCTTGCCX3AC CTC0GAA6CA AACTCTCTAT CAtTCCTCAA GA6C0QGTGC 3960 

TGTTCAGTGG CACTGTCAGA TCAAATTTGG ACCCCTTCAA CCAGTACACT GAAGAOCAGA 4020 

TTTGGGATGC CCTGGAGAGQ ACACACATGA AAGAATGTAT TOCTCAGCTA CCTCTGAAAC 4080 

TTGAATCTGA AGTQATGGAG AATGGGGATA ACTTCTCAGT GGGG6AA0GG CAGCTCTTGT 4140 

GCATA6CTA0 AOOCCTGCTC OGOCACTGTA A6ATTCTGAT TTTAGATGAA GCCACAGCTQ 4200 

OCATGQACAC AGAGACAGAC TTATTGATTC AAGAGACCAT OOGAGAAGCA TTTGCAS^CT 4260 

GTAOCATQCT GACCATTGCC CATCGCCTGC ACAOGGTTCT AGGCTCCGAT AGGATTATGG 4320 

TGCTGGCCCA GG6ACAGGTG GTGGAGTTTG ACACCCCATC GGICCTTCTO TCCAA06ACA 4380 

GTTCC06ATT CTATGCCATG TTTGCIGCTG CAGAGAACAA GGTOGCTGTC AA6G6CTQAC 4440 

TCCTCCCTGT TGAOGAAGTC TCTTTTCTTT AGA6CATT6C CATTCCCT6C CTGGGGOGGG 4500 

CCCCTCATOQ CGTCCTCCTA COGAAACCTT OOCTTTCTOG ATTTTATCTT TC GCACA GCA 4560 

OTTCOGGATT GGCTTGTGTQ TTTCACTTTT AGGGAGAGTC ATATTTTGAT TATTGTATTT 4620 

ATTCCATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

G6GAACCGTT ATTATAATTG TATCAGAG6C CTATAATGAA GCTTTATACO TGTAGCTATA 4740 

TCTATATATA ATTCTGTACA TAGCCTATAT TTACAGtGAA AKTGTAAGCT GTTTATTTTA 4800 

TATTAAAATA AGCACTGTGC TAATAACAOT 6CATATTCCT TTCTATCATT TTTGTACAGT 4860 

TTGCTGTACT A6AGATCTG0 TXTPGCTATT AGACTGTAGG AAGAGTAGCA TTTCATTCTT 4920 

CTCTAGCTGG TGGTTTCAOG GT6CCAGGTT TTCTG6GTGT OCAAAGGAAG A0GTGTG6CA 4980 

ATA6TGGGCC CTCOSACAGC CCCCTCTGCC GCCTCCCOVC AGCOGCTCCA GG0GTG6CTG 5040 

GAGAOGGGTG GGCGGCTGGA GACCATGCAG AGOSCCGTGA GTTCTCAGGG CTCCTGCCTT 5100 

CTGTCCTGGT GTCACTTACT GTTTCTGTCA GGAGA6CAGC QQGGCX3AAGC CCAGG OCCC T 5160 

TTTCACTCCC TGCATCAAGA ATGGGGATCA CAGAGACATT CTTCCGAGCC GGGGAGTTTC 5220 

rrrcL'iiK x ri' TCTTCTTTTT GCTGTTGTTT CTAAACAAGA ATCAGTCTAT CCACAGAGAG 5280

TOCXACIGGC TCAGGTTCCT ATGGCTGGCC ACTGCACAGA GCTCTCCAGC TCCAAGACCT 5340 



GTlWriXX a AGCCCTGGAQ CXaACTGCTG CTTTTTGAGO TGQCACTTTT TOITTTGCCT 5400

ATTOCCACAC CTCCACAfiTT CAGTG6C3W» GCTCAOCSATT TCCTGGCTCT mTTCCTTT 5460 

CTCACCGCAO TCGTOGCACA GTCTCTCTCT CTCTCTCCCC TCAAAG TCTG CAACTTTAAG 5520 

CAGCTCTTGC TAATCAGTGT CTCACACrGG OGTAGAAGTT TTTGTACTGT AAAGAG ACCT 5580 

ACCTCAGGTT GCTGGTTGCT GTGTGGTTTG GTGTGTTCCC GCAAACCCCC TTTGTGCTGT 5640 

GGGGCTOGTA GCTCAGGTGG GO0T6GTCAC TCCTCTCATC AGTrGAATGG TCAGCGT TGC 5700 

ATCTCGTCftC CAACTAGACA rrCTGTOJCC TTAGCATGTT TOCTGAACAC CTTGTGGARG 5760 

CAAAAATCTG AAAATCTGAA TAAAATTATT TTQGATTTTG TAAAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 

Seq ID NO: 194 Protein sequence:
Protein Accession #: NP_005679.1



1 11 21 31 41 51 

MKDIDIGKEY ilPSPGYESV RERTSTSGTH RDREDSKPRR TRPLBCQDAL ETAARABGIiS 60 

LDASMHSQLR lUJBEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLPSCM TPSWLSSLAH 120 

VAHKKGELSM EDVHSLSKHB SSDVNCRRLB RLWQEELKBV GPDAASLRRV V^nFCRTRLI 180 

LSIVCUtlTQ LAGPSGPAFM VKHLLEYTQA TESNLQYSLL LVLGLLLTBI VRSWSLALTM 240 

ALNYRTCVEL RGAIL™AFK KIUCLKNIKE KSLGELINIC SNDGQHMPBA AAVGSLLACG 300 

PWAIIjGMIY NV1IU3PTGP LGSAVFILFY PAMMPASRLT AYFRRKCVAA TDERVQKMNE 360 

VLTYIKFIKM YAHVKAPSQS VQKIRBBBRR ILEKAGYFQG ITVGVAPIW VIASWTPSV 420 

HMTIiGFDLTA AQAPTWTVF HSMTPALKVT PFSVKSLSBA SVAVDRFKSL PLMEBVHMIK 480 

NKPASPHIfCE BflKNATUVWD SSHSSIQNSP K1.TPKKKKDK KASRGKKEKV RQLQRTEHQA 540 

VLABQKGHLL U3SDERPSPB EEEGKHIHLG HLRLQRTLHS IDLEIQBGKL VGICGSVGSO 600 

KTSLISAIW QMTLLBGSIA ISGTFAYVAQ QAWILNATLR DNILFGKEVD BERYNSVUfS 660 

CCLRPDIAIL PSSDLTBI6E HGANLSGGQR QRISIAHALY SDRS IYIUP PLSAU)AHVG 720 

NHtFNSAIBK HLKSKTVIJV THQLQYLVDC DEVIPMKBGC ITERGTOEEL MNLNGDYATI 780 

FNNI.LLGBTP PVEINSKKBT SGSQKKSQDK GPKTGSVKKE KAVKPEBGQL VQLEEKGQGS 840 

VPWSVYGVYI QAAGGPLAFL VIMALEMUIV GSTAPSTWWL SYWIKQGSOI TTVTRGNETS 900 

VSDSMKDNPH MQYYASIYAIi SMAVMLILKA IRGWFVKGT LRASSRLHDB LFRRILRSPM 960 

KFPDTTPTOR lUTRPSKDMD BVDVRLPPQA BHFIQNVILV FPCV041AGV PPWFLVAV6P 1020 

LVIIiFSVLHI VSRVLIRELK RLDMITQSPF I.SHITSSIQG LATIHAYNKG QBFUffiYQEL 1080 

LDDNQAPFFL FTCAMRWIAV RLDIiISIALl TTTGLMIVLM HGQIPPAYAfi LAISYAVQLT 1140 

GLFQFTVRLA SffTEARFTSV BRINHYIKTL SLEAPARIKN KAPSPDWPOB GBVTFESAEM 1200 

RYREaiLPLVL KKVSPTIKPR EKIGIVGRTG SGKSSW3MAL FRI*VELSGGC IKIDGVRISD 1260 

IGLADLRSKL SIIPQEPVLF SGTVRSNLDP PMQVTBDQIW DALERTHMKE CIAQLPLKLE 1320 

SBVMEaKHlNF SVGBRQLIiCI ARALLRHCKl WIBBATAAM DTBTDLLIQB TIREAPADCT 1380
MLTlAHRUrr VLGSDRIMVL AQGQWEFDT PSVLLSSDSS RPYAMPAAAB NKVAVKG 

Seq ID NO: 195 DNA sequence
Nucleic Acid Accession #: NM_006470
Coding sequence: 228..1922



1 11 21 31 41 51 

GCTGTCCTGA G0CT6AGTAC TCTAGCTOCC TTOT0GCX31T OGCATCTGGC TOCCATCCAG 60 

06CCAGCACA CAGTAA-TOAG TGOCOSAGCT TCCTCTGGGA GGGAGGAAAC AOTTAAAATC 120 

TTGCAGCAGC TGCAATCATC TAGGCGTGGT TCTCTTGTCT GACTTGGGCT GCACAGATCC 180 

TGGGCCAAGG GACAGAAGAA AGACAGCCTA GGAGCAGAOC CTCCCAGATG GCTGAGTTGO 240 

ATCTAATGGC TCCAGG6CCA CTGCCCAGGG CCACTOCTCA GCCCCCAGOC CCTCTCAGCC 300 

CAGACTCTGG GTCACCCAGC OCAGATTCTO GGTCAGCCAG CCCAGTQ6AA QAAOAOQAOG 360 

TGGGCTCCTC GGAGAAGCTT GGCAGGGAGA CGGAGGAACA GGACAG CGAC TCTGCAGAGC 420 

AGGGGGATCC TGCTGGTGAG GGGAAAGAGG TCCTGTGTGA CTTCTGCCTT GATGACACCA 480 

(aAGAGTGAA GGCAGTGAAG TCCTGTCTAA CCTGCATGGT QAATTACrGT GAAGAGCACT 540 

TOCAGCCGCA TCWSQTGAAC ATCAAACTGC AAAGCCACCT 0CIGAC0GA6 OCAGTGAAOa 600 

ACCACAACTG GCGATACTGC CCTGCCCACC ACASCCCACT GTCT6CTTTC TGCTGCCCTG 660 

ATCAGCAGTO CATCTGCCAG GACTGTTGCC AGGAGCACAG TGGCCACACC ATAGTCTCCX: 720 

TQ6ATGCAGC COSCAGGGAC AAGGAGGCTO AACTCC3«3TG CACCCAGTTA QACTTGGAGC 780 

GGAAACTCAA GTTGAATGAA AAT6CCATCT CCAGGCTCCA GGCTAACC3A AAGTCIGTTC 840 

TGGTGTC6GT GTCAGAGGTC AAAG0GGTG6 CTGAAAT6CA GTTTeGOGAA CTCCTTGCT6 900 

CtGTGAGGAA GGCCCAGGCC AATGT6ATGC TCTTCTTAGA 66A6AAGGAG CAAOCTGOGC 960 

TGAGCCAOOC CAA06GTATC AAGGCCCACC TGGACTACAG GAGTGCOGAQ ATGGA6AAGA 1020 

GCAAGCAGGA GCTGGAGAGG ATGGCGGCCA TCAGCAACAC TGTCCAQTTC TTGGAG6AGT 1080 

ACT6CAAGTT TARGAACACT GAAGACATCA CCTTCCCTAG TGTTTAOGTA GGGCT6AAGG 1140 

ATAAACTCTC GGGCATCCGC AAAGTTATCA OGGAATCCAC TOTACACTTA ATOC»GTTGC 1200 

TGGAGAACTA TAAQAAAAAG CTCCAGGAGT TTTCCAA6GA AGAGGAGTAT GACATCAGAA 1260 

CTCAAGTGTC TGCCXSTTGTT CAGCGCAAAT ATTGGACTTC CAAAOCTGAO COCAGCACCA 1320 

GGGAACAOTT CCTCCAATAT GOGTATGACA TCACXTrTTGA CCCGGACACA GCACACAAGT 1380 

AJCrOCGGCT GCAGGAGGAG AACOGCAAGG TCACCAACAC CACGCCCTGG GAGCATCCCT 1440 

AOCCGGACCT CCCCAGCAGG TTCCTOCACT GG06GC3U3GT GCTGTOOCAO CAGA GTCTGT 1500 

ACCTGCACAG 6TACTATTTT GAGGTGGAGA TCTT0G6GGC AGGCACCTAT GTTGGCCTGA 1560 

QCI6CAAA0Q CATGOACOGO AAAGGGGAGO AGOGCAACAG TTGCATTTCC GGAAACAACT 1620 

TCTCCTOGAG CCTCCAATC6 AACGGGAAGQ AGTTCAOGGC CTGGTACAGT GACATGGAGA 1680 

CCCCACTCAA AGCTGGCCCT TTCCGGAGQC TCGGGGTCTA TATOSACTTC CCGGGAGGG A 1740 

TCCTTTCCTT CTATGGOn'A GAGTATGATA CCATGACTCT GGTTCACAAO TTTGOCTGCA 1800

AATTTTCAGA ACCAGTCTAT GCTGCCTTCT GGCTPTCCAA GAAGGAAAAC GCCATCCGGA 1860

TTG1MATCT GGGAOAGGAA CCOGAGAAGC CAOCACCGTC CTTOGGGGTG ACTOCTCCCT 1920 

AGACTCCAGO AOOCATATCC CAGACCTTT6 CCAGCTACAO TGATGGGATT TGCATTTTAG 1980 

GGTGATTTGT GGGCAGAAAT AACTGCTGAT GCTAGCTGGC TTTT6AAATC CTATO6GGTC 2040 

TCTQAATCAA AACATTCTCC AGCTOCTCTC TTTTOCTCCA TATGGTGCTQ TTCTCTATGT 2100 

GTTICCAGTA ATrCTTTTTT TTTTTTrPGA GAOGGAGTCT OGCACTGTTG CCCAGGCTGO 2160 

A6A6CAGXG0 OGOOATCTCG 0CTCACT6CA AGCTCOQCCT COCSOAGTrCA AGCAATTCTC 2220 

CTGOCTCAGC CTCCOBABTA GCTGGGATTA CAOQIGCCTO C3CACCACACC CAGCTAATGT 2280 

TTTCTATTTT TAfltAQAfiAT 0006TTTCAC CATOTTGGOC AGGCAGATCT CAAACTCCT6 2340
AOCTGCrrGAT GCACOCACCT GGGOCTOCCA AAGTGCXGGG ATTACMGOQ TGAOOCACZO 2400
CGOCCTGCCT GTTTGTAGTA ATTTTTASGC ACCAAATCTC OCTCATCTTC TAGTGGCATT 2460 
CTOCTCTCTQ TTCA80TAAA TQTGACACTG TGCCCAGAftT GGATGACCAO 6AA0CTTAAA 2520 
GAGTGGCTQA AAAOATTGCA 6AGTTATCAT AATAAATTGC TAACTTGGGT 



Seq ID NO: 196 Protein sequence:
Protein Accession #: NP_006461

1 11 21 31 41 51

| | | | | |

KAEU)XMAP6 PLPRATAQPP APLSPDS6SP SP0S6SASPV EEEDVGSSEK IiGRETEEQDS 60 

DSAEQC2>PAG EGKEVLCDFC LDDTRRVKAV KSCLTOWNY CEEHLQPHQV NIKLQSHLLT 120 

EPVKDHNWHY CPAHHSPIiSA FCCPDQQCIC QDCCQEHSGH TIVSLDAARR DKEAELQCTQ 180 

U3IiERXLKLH ENAISRLQAN QKSVLVSVSE VXAVASiIQFG BLLAAVRKAQ ANVMLFLEBK 240 

EQAALSQANG IKAHLBYRSA EMEKSKQELE RMAAISNTVQ FLSEYCKFKN TEDITPPSW 300 

VGLKDKLSGI RKVITESTVH LIQLLENYKK KLQEFSKBBE YDIRTQVSAV VQRKYWTSKP 360 

EPSTRBQPLQ YAYDITFDPD TAHBnfI«RLQE ENRKVTKTTP MEHPYPDLPS RFLHWRQVLS 420 

QQSLYLHRYY FEVEIPGAGT YVGLTCKGID RKGEERNSCI SGUKPSWSLQ HKGKEFTAWY 480 

SDMETPLKAG PFRHLGVYID PPGGILSFYG VBYDTMTLVK KFACKFSEPV YAAFWLSKKB 540 
NAZRZVDLGB EPEKPAPSLG VTAP 

Seq ID NO: 197 DNA sequence
Nucleic Acid Accession #: NM_004316
Coding sequence: 433-1149

1 11 21 31 41 51

| | | | | |

CCCGAGACCC GGGGCAAGAG AGCGCAGCCT TAGTA6GA6A GGAA0GC6AG AC38CQ6CAGA 60 

GCGCGTTCAG CACIGACTTT TGCTGCTGCT TCTGCTTTTT TTTTTCTTAO AAACAAGAAG 120 

G0GCCA606G CAGCCTCACA CX3CGAOG6CC ACGCGAGGCT CCOGAAGCCA ACCOGCGAAG 180 

GGAG6AGGGO AGGGAGGAGG AGGCXKX3GT6 CAGGGAGGA6 AAAAAGCATT TTCACCTTTT 240 

TTOCTOOCAC TCTAAGAAGT CTCOOSQO G A TTTTGTATAT ATTTTTXAAC TTCXX3TCAG6 300 

GCTCCOGCTT CATATTTCCT ■ riTCn 'TCCC TCTCTGTTCC TGCACCCAAO ITCTCTCTOT 360 

GTCCCCCTCG OGGGOCCCGC ACCTCGCGTC CCXR3ATCGCT CTGATTCCGC GACTCCTTGG 420 

CCGCCGCTGC GCATGGAAAG CTCTGCCAAG ATGGAGAGCG GCOGCXKrCGG CCAGCAGCXTC 480 

CAGCCGCAOC CCCAGCAGCC CTTCCTQCOG CCXX3CAGCCT GTTTCTTTGC a«3SGCCGCA 540

GOCQOQGOGG CC6CAGC0GC 06CAG0GGCA GOGCAGAGOS GGCAGCAGCA GCASCAGCAG 600 

CAGCAGCAGC AGCA6CAGCA GCAG6C6C06 CAGCTGAGAC G6G0Q6CCGA G60CCAGCCC 660 

TCAGGGGGCG (JTCACAAGTC AG0GCCC3\AG CAAGTCAAGC GACAGCGCTC GTCTTCGCCC 720 

GAACTGATGC GCTGCAAAOG CCGGCTCAAC TTCAGOGGCT TTGGCTACAG CCTGCCGCAG 780

CAGCAGCOGG CX3G006TG6C GCGCCGCAAC GAGCGOGAGC GCAACCGCGT CAAGTTGGTC 840 

AACCTOGGCT TTOCCACCCT T0QG6AGCAC GTCC0CAA06 GCXKXS60CAA CAAGAAGA70 900 

A0TAAGGT6G AGACACTGOG CTCG60GGTC GAGTACATCC G06G6CTGCA GCAGCTGC7G 960 

GACGAGCATG ACGCGGTGAG OGCCGCCTTC CAGGCAGGOG TCCTGTCGCC CACXATCTCC 1020 

CCCAACTACT CCAAOGACTT GAACTCCATG GCCGGCTCGC OGGTCTCATC CTACTCGTCG 1080 

GACGAOGGCT CTTACGACCC GCTCA600CC GAGGAGCAQG AGCT TCTCG A CTTCACCAAC 1140 

TGGTTCTGAO GGGCTOQGCC TGGTCAGGOC CTGGT6CGAA TG6ACTTTGG AAGCA66GTG 1200 

ATGGCACAAC CTGCATCTTT AGTGCTTTCT TGTCAGTG6C GTTGG6AGGG G6AGAAAAGG 1260 

AAAAGAAAAA AAAAGAAGAA GAAGAAGAAA AGAGAAGAAG AAAAAAACGA AAAGAGTCAA 1320 

CCAACCCCAT CGCCAACTAA GCX3AGGCATG CCTGAQAGAC ATGGCTTTCA GAAAAOGGGA 1380 

AGCGCTCAGA ACAGTATCTT TGCACTCXIAA TCATTCAOQG AGATATGAAG AGCAACTGGG 1440 

ACCTGAGTCA ATGCGCAAAA TGCAOmGT QTGCAAAAGC AGT6GGCT0C TGGCAGAAGO 1500 

GAGCAGCACA OQOGTTATAO TAACTCCCAT CACCTCZAAC AC6CACA6CT GAAAGTTCTT 1560 

GCTOSGGTCC CTTCACCTCC C06CCCTTTC TTAGAGTGCA GTTCTTAGCC CTCTA6AAAC 1620 
GAGTTGGTGT CTTTC 



Seq ID NO: 198 Protein sequence:
Protein Accession #: NP_004307

1 11 21 31 41 51

| | | | | |

I4ESSAKMESG GAOQQPQPQP QQPFLPPAAC PPATAAAAAA AAAAAAAQSA QQQQQQQQQQ 60 

QQQQAPQLRP AADGQPSGGG HKSAPKQVKR QRSSSPELHR CKRRLNFSGF GYSLPQQQPA 120 

AVARRtlERER MRVKLVNLGF ATLREHVFMG AAKKKMSKVB TLRSAVEYIR ALQQLIiDEBD 180
AVSAAFQAGV LSPTI8PHYS NDLNSMAGSP VSSYSSDG6S YDPLSPEBQB LLDFTNHF 

Seq ID NO: 199 DNA sequence
Nucleic Acid Accession #: NM_007015
Coding sequence: 1-1005

1 11 21 31 41 51

| | | | | |

ATGACAGAGA ACTCOGACAA AGTTCCCATT 6CCCTGGTGG GACCTGATGA OSTGGAATTC 60 

TGCAOCCCCC CGGCGTACGC TACGCTQACX5 GTGAAGOCCT CCAGCCCOGC GOGGCTGCTC 120 

AAGGTGGGAG CCG7GGTCCT CATTT06GGA 6CTGTGCTGC T6CTCTTTGG GGCCATGGGG 180 

GOCTTCIACT TCTQGAAGGG GAQOQACAGT CACATTTACA ATGTCCATTA CACCAT6AGT 240 

ATCAATGGSA AACTACAAGA TG6GTCAATG GAAATAGAOO CT6GGAACAA CTTG6A6ACC 300 

TTTAAAATGG GAAGTGGAGC TGAAGAAOCA ATTGCAGTTA ATGATTTCCA 6AATGGCATC 360

ACAGGAATTC GTTTTGCTGG AGQAGAGAAG TGCTACATTA AAGC6CAAGT QAAGGCTCGT 420 

ATTCCTGAGG TGGGG6CCGT GAOCAAACAG AGCATCTOCT CGAAACTGGA AGGCAAGATC 480 

ATGCCAGTCA AATATGAAGA AA A TTCTCTT ATCTGOaiaO CIGTAQA3GA G OCI G TC AAQ 540 

GACAACAGCT TCrTGAGTTC TAAGGTGTTA QAACTCTQOG OTQACCTTCC TATTTTCTGQ 600 

CTTAAAOCAA CCTATCCAAA AGAAATCCAG AGGGAAAGAA 6A6AAGTG6T AAGAAAAATT 660 

GTTCCAACTA CCACAAAAAG ACCACACAGT GGACCACX3GA GCAAOOCAGG OGCIGOAAGA 720 

CTGAAIAATG AAACCAOACC CAGTGTTCAA GAGGACTCAC AA6CCTTGAA TCCTGATAAT 780
CCTTATCATC AGO^GGAAGG GGAAAGCATO ACATTOGACC CTASACrOOA TCACGAAOSA 840

ATCTGTTGTA TAGAATGTAG GOSGAGCTAC ACCCACTGCC AfiAAGATCTG TGAACCCCTG 900 

GOOGGCTATT ACCCATGGCC TTATAATTAT CAAGGCTGCC 6TTCGGCCTG CAGAGTCATC 960 

ATGCXaTGTA GCTCGTGGGT GGCCXGTATC TTGGGCATGG TGTGAAATCA CTTCATATAT 1020 

CAOGTGCTGT AAAATAAGAA CTAGCTGAAG AGACAACCAA AGAAGCATTA A6GCAG6TTG 1080 

ATGCTCATGG 6ACCATAAAA TATTTTTACA CGCAGCCTGA GOGOTTATTC TTGACACTCT 1140 

TAACAGAATT rTTTTAATOG TTTTCCAGAA CTTTAGTATA TGCAAATGCA CTGAWVGGGT 1200 

AfiTTCAAGTC TAAAATOCCA TAACCCCGTT ATTTGTTATT TTTTATTTGC ATTGATTTGC 1260 

CATAAGTCTT CCCTTGCTTG CATCTTCCAA AGCTATTTOO AAATAAACAC GAAAATTTAC 1320 
AGTTTGCC 



Seq ID NO: 200 Protein sequence: 
Protein Accession #: NP_008946

1 11 21 31 41 51

| | | | | |

MraiSDRVPI ALVGPDDVEP CSPPAYATLT VKPSSPARLL KVGAWLISG AVLLLFGAIG 60
ApyPHKGSDS HIVBVHYTMS IKGKLQDGSM BIDAC3SNLET PKMGSGAEEA lAVNDPQNGI 120
TGIRFAGGEK CYIKAQIVKAR IPBVGAVTKQ SISSKLBGKI MPVKYEENSL IWVAVDQPVK 180
DNSPLSSKVL ELCGDLPIPW UCPTYPKBIQ RERRBWRKI VPTTTKRPHS GPRSNP6AGR 240
LNNETRPSVQ EDSQAFNPDM PYHQQEGESM TPDPRU3HEG lOCIBOUlSY THOQKICBPL 300
GGYYPWPYKY QGCRSACRVI MPCSMWVARI

Seq ID NO: 201 DNA sequence
Nucleic Acid Accession #: NM_000728.2
Coding sequence: 112..495

1 11 21 31 41 51

| | | | | |

GTAATAAGAG CGGGGTCTCC GCGGGGAAGG CGCCCACAGC AGGTGTGGTG TTCATCCOGG 60
GTC6AC0Q6C 06CTGGCGCT 6CCCTGAAAC TCTAGICGCC AGAGAGGOGG CATGGGTTTC 120
OGGAAGTTCT CCCCCTTCCT 06CTCTCABT ATCTTGGTOC TGTACCAGGC GGGCAGCCTC 180
CAGGGGGOGC CATTCAGGTC TGCOCTGGAG A6CAGCCCAG ACCOGGCCAC ACTCAGTAAA 240
GAGGACGCGC GCCTCCTGCT GGCTGCACTG GTGCAGGACT AT6TGCAGAT GAAGGCCAGT 300
GAGCTGAAGC AGGAGCAGGA GACACAGGGC TCCAGCTCCX3 CTGCCCAGAA GAGAGCCIGC 360
AACACTGCCA CCTGTGTGAC TCATCGGCTG GCAGGCTTGC T6AGCAGATC AGGGGGCATG 420
GTGAAGAGCA ACTTOGTGCC CACC3AT6T6 GGTTCCAAAG CCTTTGGCA6 6CX3COSCAGG 480
GACCTTCAAG CCTGAGCAGA TGAATGACrC CAGGAAGAA6 GTGTGTCCTA AATCCAATGA 540
CATATCCTTA TAAGAGATTC ACTCAGAAGA CACATGTGGA GAAGGTGACA T6ACAGA0GC 600
AAGGA66CAC AAGCCAAGGA AGTCTGTGTC TACCAGAAGC CAGAATCACA GAACAGTCTC 660
TGGAAGAAGA GCAGCCCTGC TGACACCTAG AGTTTG6ACT TCCAGCTTCC AGAACTGTGA 720
GAGAATAATT TCTGTTQTTT TAAGCCACAA AGTTTGT6GT AATTT6TTAT GACAG0C3CTA 780
GGAAACTAAT ACAATACATT TTCATTTATT TTGGGTAAAT GCCTTGGAGT GOGATTGCTG 840
GGTTATTT60 AAAGTGTGTA TTTAACTCTO TAAGAAACTG CCAAACTATT TTCTQAAGTG 900
ACTGTACCAC TTCGCCTTCT TGCCAGCCAC ATATGAGAGC TCTAGTATTT CCACAAATAG 960
GTATGTAGCA GTATCTCATT GCTGTTTTAA TTTGTATTTC CCCAATGACT AATGACGTTG 1020
AGCATCTATT TTACCATAT6 TTTATCACCT TTATTOAAGO GTCTGTTTAA ATCTTCTOCT 1080
AAATTTTTGT TGGCTTGCrr GCTTTATTA6 TQTT6AGTTT TTAGAGCTCT TTATATQTTG 1140
TGQATGCAAG ATTGTTTTCA GATATATAGT TTGGAAACTT CCTTCCCCTG AATCTGCGGA 1200
TTOCTTTTTC ATTTTCTTAG CAGTGTCTCT CACAGAGAAA AAGTTGTAAT TTGAATAAGA 1260
TOCAATTCAT CTTTTTTTTT CTTTTATGTA TTGTGCTTTT A6TTCAT6TC TAAGAACTCT 1320
TTGCCTAACT AAGGTCCCAA GGTCACAATA ACCTTATTCT ATACTTTCTT GTAAAAGTTT 1380
TATAGTTTTA TATTTTATAT GTAGATTAGT GATCTATTTT GAGTTAATTT TTGTATAA66 1440
TGAGAG6TGT AGGTTGAAAT TCATACCTGT GAATATAGAT ACXX»ATTGT TTCAGTOCCA 1500
TTT6TTAAAA AGACTGTTAT TTCACCATTT AATTGCCCCT GCACCTTTGT CAAAAAGCAA 1560
CTGATCATAT TTGTGTGGGT ATATTTCTGG GTTCTCAATT CTGTCTCATT GATTGATTTG 1620
ACCATTCm TGCCAATGTC ATACTGOCTT GATTAGTGTA GTOVTAAAGT GAATCTCAAA 1680
ACCA6ATAAT GTGGGTCTAC cfjijckTiXan' CATTCTTGTT CAAAAAGATT TTAGCTACAT 1740
CTAAAATATT TTCTACATCT TTTATACATT TTAGAATCAG TGTGTTACTA TCTACAAAAT 1800
TTCTGATGAQ ATTTTTAATG G6ATTGTGTT AAATCAGTG6 GTTAATTTTG 6GAGAATTAG 1860
CATATTAATA ATATTAAGTC GTTCAATTCA TGAACACAAT ACAT6TTTTC ACTTATTTAG 1920
GTTTTCTCTG TTTTTTTTTT TTTAACAGTO TTCTCAGTTT TCAACAGAAA TATTCTACAC 1980
ATATCTTGTT AGATTTTTAA CTATTTTATT TTTTOGTQCT AATGTAAATG GTACTTAAAC 2040
ATTTTTOTTT TTAATTGTTC ATT6CTAGTA OATAGAAATA OUITATTTAA AATATTAG6A 2100
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA

Seq ID NO: 202 Protein sequence:
Protein Accession #: NP_000719.1

1 11 21 31 41 51

| | | | | |

MGFRKPSPPL ALSILVLYQA GSLQAAPFRS ALESSPDPAT LSKBDASLLL AALVQDYVQM 60
KASELKQBQE TQGSSSAAQK RACNTATCVT HHIAGLIiSRS QGMVKSNFVP TOVGSKAPGR 120
StRBDLQA



Seq ID NO: 203 DNA sequence
Nucleic Acid Accession #: NM_001741
Coding sequence: 71..496

1 11 21 31 41 51

| | | | | |

CTCTGGCTGG ACGCCGCCGC OGCOSCTGCX; ACCGCCTCTG ATCCARGCCA CCTCCOGCCA 60
GAGAGGTGTC ATGGGCTTCC AAAAGTTCTC CCXXTTCCTG GCTCTCAGCA TCTTGGTCCT 120
OTTOOWSGCA GGCA6CCTCC ATSCAOCACC ATTCAG6TCT GCCCTGGAGA GCAGCCX3U3C 180
AGACCC06CC ACGCTCASTO AGGAOGJAGC GCGCCTCCTG CTGGCTGCAC TG6TQCAGGA 240
CTATOTOCAQ ATGAAGGCCA GTGAGCrOGA GCAGGAGCaA GAGAGAGAGG GCTCCAGCCT 300
GGACAGCCCC AGATCTAAfiC GGT6GSGTAA TCTGABTACT TGCATGCTGO 6CACATACAC 360
GCAGGACTTC AACAAGTTTC ACAOGTTCOC GCAAACTGCA ATTGGG6TT6 C»GCACCTGG 420
AAAGAAAAGG GATATGTCCA GCGACTTGGA GAGAGAOCAT CGCCCTCATG TTAGCATGCC 480
CCAGAATGCC AACTAAACTC CTCCCTTTCC TTCCTAATTT CCCTTCTTGC ATCCTTCCTA 540
TAACTTGAT6 CATGTGGTTT GGTTCCTCTC TGGTGGCTCT TTGC36CTGGT A3TGGTGGCT 600
TTCCTTGTGG CAGAGGATGT CTCAAACTTC AGATGGQAQG AAAGAGAGGA GGACTCACAG 660
GTTG6AAGA6 AATCACXTTGO GAAAATAOCA GAAAA3GA0G GC06CTTTGA GTGOCCCAGA 720
GATGTCATCA GAGCTCCTCT GTCCT6CTTC TGAATGTGCT GATCATTTGA G6AATAAAAT 780
TATTTTTOCC C



Seq ID NO: 204 Protein sequence:
Protein Accession #: NP_001732

1 11 21 31 41 51

| | | | | |

MGFQKFSPFL ALSILVLLQA GSLBAAFPSS ALESSPADFA TLSEDEARUi LAALVQDYVQ 60
MKASBLEQBQ ERBGSSLDSP RSKRCGMLST OOiCmODF HKFBTPPQTA IGVGAPGKKR 120
OMSSDIiERDH RFBVSMPQNA H

Seq ID NO: 205 DNA sequence
Nucleic Acid Accession #: NM_005361
Coding sequence: 1-945

1 11 21 31 41 51

| | | | | |

AT6CCTCTTO AGCAGAGGAG TCAGCACTGC AAGCCTGAAG AAGGCCTTGA GGCCCGAG6A 60
GAGGCCCTGG [illegible] TG06CAGGCT CCTGCTACIG AGGAGCASCA GACOGCTTCT 120
TCCTCTTCTA CTCTAGT66A AGTTAOCCTG GGOGAGGTQC CTGCTGCOGA CTCACCGAGT 180
CCTCCCCACA GTCCTCAGGG AGCCTCCAGC TTCTOGACTA CCATCAACTA CACTCTTTGQ 240
AGACAATCCG ATQAOGGCTC CAGCAACCAA GAAGAGGAGG GGCCAAGAAT GTTTOCCGAC 300
CT6GA0TCC6 AGTTCCAAGC AGCAATCAGT AGGAAGATGG TTGAGTTGGT TCATTTTCTG 360
CTCCTCAA6T AT0GA6CCAG GGAfiCCGGTC ACAAAG6CAG AAkTGCTGGA 6AGTGTCCTC 420
AGAAATTGCC A06ACTTCTT TCCC6TGATC TTCAGCAAAG CCTCOGAGTA CTTGCAGCTG 480
GTCTTTGGCA TOSAGGTGGT GGAAGTGGTC CCCATCApCC ACTTGTACAT CCTTGTCACC 540
TGCCTGGGCC TCTCCTACGA TGGCCT6CTG GGOOACAATC AGGTCATGCC CAAGACAGGC 600
CTCCTGATAA TCX5TCCTGGC GATAATOGCA ATAGAGGGCX5 ACTGTGCCCC TGAGGAGAAA 660
ATCTGG6AGG AGCT6AGTAT GTTGGAGGT6 TTTGAGGGGA GGGAGGACAG TGTCTT06CA 720
CATCCCAGGA AGCTGCTCAT GCAAGATCT6 GTGCAGGAAA ACTACCTGGA GXACOGGCAG 780
GTGCCOGGCA GTGATCCTGC ATGCTAOGAG TTCCTGTGGG GTCCAAGGGC OCTCATTQAA 840
ACCAGCTATG TGAAAGTCCT GCACCATACA CTAAAGATCG GTGGAGAACC TCACATTTCC 900
TACCCACCCC T6CATQAA0G GGCTTTGAGA GAGGGAGAAG AGTGA

Seq ID NO: 206 Protein sequence:
Protein Accession #: NP_005352

1 11 21 31 41 51

| | | | | |

MPLBQRSQHC KPBBGIiBASG EALGIiVGAQA PATEBQQTAS 8SSTLVBVTL QEVPAADSPS 60
PPHSPQGASS PSTTINYTLW RQSDEGSSMQ EEEGFRMFPD LESEFQAAIS RKMVELVHFL 120
LLKyRAREPV TKAEMLESVL RNCQDFFPVI PSKASEYLQL VFGIBWBW PISHLYILVT 180
OiGLSYDGLL GDNQVMPKTG UiIIVLAIIA IB6DCAPEEK INEEZiSMLEV FEGREDSVFA 240
HPRKXiUIQDL VQENYlfYRQ VPGSDPACyB FLWGFRALXB TSYVKVLBHT UCIGGSPaXS 300
YPPLHERALR B8EB

Seq ID NO: 207 DNA sequence
Nucleic Acid Accession #: NM_021115
Coding sequence: 743-2893

1 11 21 31 41 51

| | | | | |

AAAGGAAG6G AG6GAGGGAG AAAGGAGAAG TT6GTTTAGA GGCCA6C06G AOGAGCTTTG 60
G6CACC60CC TTAGGAGG6C CACCCTCASA OTCTSACAGC AGGXGAAOGT CCTAAA7CTC 120
CCCAAACTAA CTGGT8TCTT TTCTCCTCTT CCAAGATGCr CTTCCGGAG6 GAGATCCTA6 180
CCCTTTGGGT CCTTACCTCC TGCCCTCAGG AGCCC0GGA6 AGAGGCAGTC CTGGCAAAGA 240
GCACCCTGAA GAGAGAGTG6 TAACAGCGCC CCCCAGTTCC TCACAGTCGG OGGAAGTGCT 300
0GG06AGCTG GTGCTSGATG GGACOSCACC CTCTGCACAT CAOGAGATCC CAQCCCTGTC 360
ACOGCIGCTT 0CAGAGGAG6 O00C3CCCCAA GCAC6CCTT0 OCCCCCAAOA AGAAACTGCC 420
TTOGCTCAAG CAGGTGAACT CTGCCASGAA 6CAGCT6A60 CCCAAG6CCA CCTCC96CAGC 480
CAGTGTCCAA AGQGCAGGGT CCCAGCCA6C GTCCCAGGGC CTAGATCTCC TCTCCTCCTC 540
GAGGGAGAAG CCTGGCCCAC 06GGGGACCC GGACCCCATC GTG6CCTC06 AGGAGGCATC 600
AGAAGTGCCC CTTTGGCTGG AC06AAAGGA GAGTG0G6TC CCTACAACAC COGCACCCCT 660
GCAAATCTCC CCCTTCACTT GOCAOCOCTA TGTGGOCCAC ACACTCCCOC AGA06CCAGA 720
ACCCGGGGAO CCTGGGCCTG ACATG6CCCA QGAGGCOCCC CAGGA6GACA CCAGCCCCAT 780
QGCCCTGATO GACAAAGGTG A6AATGAGCT GACTG«3TCA GCCTCAGAGG AGAGCCAGGA 840
GACCACTACC TCCAOCATTA TCACCACCAC OGTCATCACC ACOGAGCAGG CACC3U3CrCT 900
CTGCAGTGTG AGcnvrccA ATCCTGAGC^ GTACATTGAC TGCAGCQACT ACCCACTGCT 960
GCCCCTCAAC AACTTTCTGG AGTGCACATA CAAOGTQACA GTCTACACTG GCTATG6GGT 1020
GGAGCTCCAG GT6AAGAGTQ TGAACCTGTC CGATG6GGAA CTGCTCTCCA TC060G06GT 1080
GGAC6GCCCT ACCCTQACOG TCCTGQCCAA CCAGACACTC CTGGTGGAGG GGCAGGTAAT 1140
CCGAAGCCCC ACCAACACCA TCTCOGTCTA CTTC06GA0C TT0CAGGA06 ACQGCCTTGG 1200
GACCTTCCAG CTTCACTACC AGGCCTTCAT GCTGAGCTGC AACTTTCCCC GCOGGOCTGA 1260
CTCTGGGGAT GTCA0GGT6A TGGACCTGCA CTCAGGTG6G GTGGCCCACT TTCACTGCCA 1320
CCTGGGCTAT GAGCTCCAG6 GCGCTAA6AT OCTGACATGC ATCAATGCCT CCAAGCCGCA 1380
CTGGAGCAjQC CAGGAGCOCA TCTOCTCAGC TCCTTCTGGA GGG6CAGTGC ACAATGCCAC 1440
CATCG6C0GC GTCCTCTOCC CAAGTTACOC TGAAAACACA AATtSSGAGCC AATTCTGCAT 1500
CTGGACGATT GAA6CTCCAG AGGGCCAGAA 6CT6CA0CTO CACTTTGAGA GGCTGTTGCT 1560
GCATGiVaVAG g;^CAGGAT6A. QGGTTCftCAG 0GGGCA6A0C AACAAGTCRG CTCTTCTCTA 1620
CGACTOCXnT CAAACOGAGA GTGTCCCTTT TCSIGGGCCTG CTGAG06AAG GCAACACCAT 1680
COOCATOGAG TTCA0GTC06 ACCA6GCC0G GGCXX5CCTCC ACCTTCAACA TCCGATTTGA 1740
AGOGTTTCSAO AAAOGOCACT GCTAIGAGCC CTACATOCAG AAT666AACT TCACTACATC 1800
G8ACCCXSA0C TATAACATTG G6ACTATAGT GGAGTTCACC T606ACXX3C6 GCCACTCOCT 1860
GGAGCAGGGC CCGGCCATCA TCGAATGCAT CAATGTGGGG GACCCATACT GGAATGACAC 1920
AGAGCCCXTPG TGCaCAGCCA TGTGTGGTGG GGAGCrCTCT GCTGTGGCTG GGGTGGTATT 1980
GTCCCCAAAC TGGCCOSAGC CCTAOGTX3GA A6GTGAAGAT TGTATCTGGA AGATOCAOGT 2040
GGGAGAAGAG AAACGGATCT TCTTAGATAT CCAGTTCCTG AATCTGAGCA ACAGTGACAT 2100
CTTGAOCATC TACGATGGCG AOGAGGTCAT GCCCCACATC TTGGGGCAGT ACCTTGGGAA 2160
CAGTGGCC30C CAGAAACTGT ACTCCTCCAC GCCAGACTTA ACCATCCAOT TCC3VTTCGGA 2220
CCCTGCTGGC CTCATCTTTG GAAAGGGCCA GGGATTTATC ATGAACTACA TAGAGGTATC 2280
AAGGAATGAC TCCTGCTGGG ATTTACCOGA GATCCAGAAT GGCTGGAAAA CCACTTCTCA 2340
CACX5GAGTTG GTGCGOGGAG CCAGAATCAC CTACXaGTGT GACXXX3G6CT ATXSACATOGT 2400
GGGGAGTGAC ACCCTCACCT GCCAGTGGGA CCTCAGCTGG AGCAGOGACT CCCCATTTTQ 2460
TGAGAAAATT ATGTACTGCA C0SACCC066 AGAGGTGGAT CACTOGACCC GCTrAATTTC 2520
GGATOCTGTG CTGCTGGTGG GGAOCACCAT CCAATACACX: TGCAACCCXX3 GTTTTGTGCT 2580
TGAAGGGAGT TCTCTTCT6A CCTGCTACAG CCGTGAAACA GGGACTCCCA TCTGGACGTC 2640
TCGCCTGCCC CACTGOGTTT CAGAAGOGGC AGCAGAGACX3 TCX^CTGGAAG GGGGGAACAT 2700
GGCCCTOGCT ATCTTCATCC 0G6TCCTCAT CATCTCCTTA CTGCTGGGAG GAGCCTACAT 2760
TTACATCACA AGATGTOGCT ACTATTGCAA CCTCOGOCTG CCTCTQATGT ACTCCCACCC 2820
CTACAGOCAG ATCA0CGTG6 AAA006AGTT T6ACAACCCC ATTTAOGAGA CAGOGGGAAC 2880
CCAAAAGGTT TAGGGTrrCA TTTAAAAAGA GGTACCCTTT AAAAAOGGGC TTGTGAACTC 2940
AACCCCAATT TCCCCGAGAC ATTTATCCAA AGGCCCTGGG GGCCTTGATT TAAACCCCCA 3000
AAAGGOGGCT GTTTTTTGGT TAAACTTTTT AACAAAGGGT TAOGGGTTTT TTCCCCGGAT 3060
TTTATAAATT TTAAAAGTG



Seq ID NO: 208 Protein sequence:
Protein Accession #: NP_066938

1 11 21 31 41 51

| | | | | |

MAQEAPQEDT SPMAmDKGB NELTGSASEB SQBTTTSTII TTTVITTEQA PALCSVSFSN 60 

PBGYIDSSDY PIiliPiaOIFLE CTYNVTVYTG YGVEIiQVKSV MLSDGELLSI RGVD6PTLTV 120 

LANQTXJiVEG QVZRSPTOTI SVYFF^TPQtXD GLGTFQiaYQ APMLSCHFFR RPDSGDVTVM 180

DliHSGGVAHF HCHLGYELQG AKKLTCINAS KPHHSSQEPX CSAPGGGAVH HATIGRVLSP 240 

SYPSmiGSQ FCIKTIEAPE GQKTjHT^HFER LLIiKDKDRMT VHSOQTKKSA LLYDSLQTES 300 

VPPBGLLSBG NTIRIEPTSD QARAASTFNI RFEAFEKCffiC YEPYIQNGNP TTSDPTYNIO 360 

TIVBFTCDFG HSLBQGPAII BCIMVRDPYW NDTEFLCRAN OGGSLSAVAG WLSPNHPEP 420 

YVESEDCIWK lEVGBBXRlF U3ZQFLNLSN SDZLTZYIXa) BVMPBZLGQY LGNSOPQKLY 480 

SSTPDZjTZQF H5DFAGLZPG KOOGFIMNYI EVSHNDSCSD LPEIQMGHKT TSHTELVRGA 540 

HITYQCDPGY DIVGSDTLTC QWDLSWSSDP PFCEKIMYCT DPGEVDHSTR LISDPVLLVG 600 

TTIQYTCNPG FVLBGSSLLT CySRETGTPI WTSRLPHCVS EAAAETSLBO GNMALAIPIP 660 
VLIZSI*LLGG AYIYITRCRy YSNLRLPLMY SHPYSQITVB TETONPIYET GGTQKV 



Seq ID NO: 209 DNA sequence

Nucleic Acid Accession #: NM_001327.1

Coding sequence: 89-631

1 11 21 31 41 51

| | | | | |

AGCAGGGGGC GCTGTGTGTA CCGAGAATAC GAGAATACCT CGTGGGCCCT GACCTTCTCT 60 

CTGAGAGCOG GGCAGAGGCT CX3GGAGCCAT GCAGGCCGAA 6GC0GGGGCA CAGGGGGTTC 120 

GACX3GGCGAT GCTGATG6CC CAGGAG6CCC TQGCATTCCT GATGGCCCAG GGG6CAATGC 180 

TGGOGGCCCA GGAGAGGCGG GTQCCA06GG OGOCAGAGGT OC00GQG6C6 CAGGGGCAGC 240 

AA6GGCCTCG GGGCOGGGAG GAGGGGCCCC GOGGGGTCOG CATGG0GGC6 0GGCTTCAG6 300 

6CTGAATGGA TGCTGCAGAT GOGGGGCC3U3 GGGGCCGGA6 AGCCGCCTGC TTGAGTTCTA 360 

CCTCGCCATG CCTTTCGCGA CACXXATGGA AGCAGAGCTG G(XCGCAGGA GCCTGOCCCA 420 

GGATGCCCCA CCGCTRXCG TGCCAGGGGT GCTTCTGAAO 6AGTTCACT0 TGTCOGGCAA 480 

CATACTGACT ATCXX5ACTGA CTGCT6C3W3A CCA0CX5CCAA CTGCAGCTCT CCATCA6CTC 540 

CTGTCTCCAG CAGCTTTCCC TGTTGATGTG GATCA06CAG TGCTTTCTGC OCGTGTrTTT 600 

GGCTCAGCCT CCCTCAGG6C AGAGGCGCTA AGCCCAGCCT GGCGCCCCTT CCTAGGTCAT 660 

GCCrCCTCCC CTAGGGAATG GTCCCAGCAC GAGTGGCCAG TTCATTGTGG GGGCCTGATT 720 
GTTTGTCGCT GGAGGAGGAC GGCTTAC3VTX3 TTTGTTTCTG TAOAAAATAA AACTGAOCTA 



Seq ID NO: 210 Protein sequence: 
Protein Accession #: NP_001318.1

1 11 21 31 41 51

| | | | | |

KQAB6RGTGG STGDADGPGG PGIFDGPGGN AGGPGEAGAT GGROPRGAGA ARASGFGGGA 60 
PSSFHGGAAS GXaSOCCRCQA RGPESRLLEF YUVKPFATSM EASLABRSLA QDAPPIiPVFG 120 
VLLKBFTV8G VZXiTIRLTAA DHRQIiQLSZS SCLQQLSLLN tfZTQCFLPVP LAQPP9GQRR 



Seq ID NO: 211 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 52-459

1 11 21 31 41 51

| | | | | |

CCTOGrGGGC CCTQACCTTC TCTCIGAGAG GCGGGCA6AG GCTCCGGAGC CATGCAQ6CC 60 

GAAGGCCAGG GOVCAGGGGG TT06AC6G6C GATGCTGATG GCCCAGGAG6 CCCTG6CATT 120 

CCTGATG6CC CAGGGGGCAA TGCTGGCGGC CCAGGAGAGG 06GGT6CX3VC GGGGGGCAGA 180 

6GTCCC0GGG GOGCAGGGGC AGCAAGGGCC TCGGGGCOGA GAGGAGGCGC CCCGCGGGGT 240 

CCGCATGGG6 GTGCCGCTTC TG0GCAG6AT GGAAGGTGCC CCTGOGGQGC CAGGA6GCCG 300
GACAGCOGCC TGCrTCAGTT CCGACTGACT GCTGCAGACC ACCGCX3U«:T 6CAGCTCTCC 360

ATCAGCTCCT GTCTCCAGCA GCTTTCCCTG TTGATGTGGA TCACGCAGTG CTTTCTOCCC 420 

GTGTTTTTQQ CTCAGGCTCC CTCAQGGCAQ AGGOSCTAAG CCCAGCCTGG OGCCCCTTCC 480 

TAGGTCATGC CTCCTCCCCT AGGGAATGGT CCCAGCACGA GTGGCC3«?rr CATTGTGGGG 540 

GCCTGATTGT TTGTOGCTGG AGGJ^GGAOGG CTTACATGTT TGTTTCTGTA GAAAATAAAG 600 
CTGAGCTA 

Seq ID NO: 212 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

HQAEGQCTGG STTCOPDSPCQ PGIPDGPGQf AGGPGEKBAT G6RGPRGA6A ARAS6PSGGA 60 

PRGFHGGAAS AQD6RCP0GA RRPDSRLLQF RLTAAOKRQIi QLSZSSCLQQ LSLLKWITQC 120 
FLFVFLAQAP SGQRR 

Seq ID NO: 213 DNA sequence
Nucleic Acid Accession #: NM_000555
Coding sequence: 416..1498

1 11 21 31 41 51

| | | | | |

CTTATTTTTT ATGAATGTOG 6ATAGCTGCA CCAGCTTGGT GGGGAAAGGG TTTGATQAAT 60 

AGCACAAAGA CACTGGCTGT TCCCTGGAGG CTGTCCSrrTT AAAGGAGAAT CTTAGTTTAT 120 

TCT6GGG6QA GGG6ATGCAC ACATTAGAGT A66AAAGAGG GCTTGGAATA AAATGAAAAC 180 

ACTCCOOCTT CATAGTCATT OTACTOAAAT GCAAAGACTG CTTCCTAAGC TGGAGATGCT 240 

AACCTTG6QT AGCTCCTTCT GTTCTCTTCA AGGGGAATTT TGTCAGGCTA TOGATTCATT 300 

TACAACTGTT AGTCATGTGG GCATGTGTGA GGAAACAGAT GOCAGTTTTA ATGTATTTAG 360 

CCCGAAGTTC CAATTTGATA GGAGCCACTG TCAGTCTCTG AGGTTCCACC AAAATATGGA 420 

ACTTGATTTT GGACACTTTO ACXSAAAGA6A TAAGACATOC AGGAACATGC GAG6CTCC06 480 

GATGAATOGQ TTGOCTAOCC CCACTGACAO GGGCCACTGT AGCTTCTAOC GAAOCAGAAC 540 

CTT6CAGGCA CTGAGTAATG A6AAGAAAGC CAAGAASGTA CX3TTTCTACC 6CAATG0G6A 600 

OCGCTACTTC AA6GGGATTQ TGTAOGCTGT GTCCTCTGAC 0GTTTT06CA GCTTTGAC6C 660 

CTTGCTGGCT GACCTGAOGC GATCTCTGTC TGACAACATC AACCTGCCTC AGGGAGTGOG 720 

TTACATTTAC ACCATTGATG GATCCAGGAA 6AT0G6AA6C ATGGATGAAC TGGAGGAAGG 780 

OGAAAGCTAT GTCT6TTOCT CAGACAACTT CTTTAAAAAG GIGGAGTACA CCAAGAATGT 840 

CAATCCCAAC TGCTCTgrCA AOGTAAAAAC ATCTGOCAAT ATGAAAGCOC CXXA6TCCTT 900 

GQCTAGC3W3C AACAGTGCAC AGGCCAGGGA GAACAAGGAC TTTGTGCGCC CCAAGCTGGT 960 

TACCATCATC CGCAGTGGGG TGAAGCCTCG GAAGGCTGTG OGTGTGCTTC TGAACAAGAA 1020 

GACAGCXXSVC TCTTTTGAGC AAGTCCTCAC TGATATCACA GAAGCCATCA AACTGGA6AC 1080 

0GGG6TTGTC AAAAAACTCT ACACTCTGGA TGGAAAACAS GTAAC7TGTC TCCAT6ATTT 1140 

CTTTGGTGAT GATGA7GTGT TTATTOCCTG TGGTOCTGAA AAATTTCGCT ATGCTCAGGA 1200 

TOATTTTTCT CTGGATGAAA ATGAATGCCG AGTCATGAAG GGAAACCCAT CAGCCACAGC 1260 

TGGCCCAAAG GCATCCCCAA CSVCXTTCAGAA GACTTCAGCX: AAGAGCCCTG GTCXriATGCG 1320 

COGAAGCAAG TCTCCAGCTG ACTCAGCAAA CGGAACCTOC AGCAGCCAGC TCTCTACCCC 1380 

CAA6TCTAAG CAGTCTCCCA TCTCTAOGCX: CACCAGTGCT GGCAOCCTCC GGAAGCACAA 1440 

GGACCTGTAC CTGCCTCTGT CCTTGGATGA CTOGQACTOQ CTTGGTGATT CCATGTAAAG 1500 

GAGGGGAGAG TGCTCAGAGT CCAGAGTACA AATCCAAGCC TATCATTGTA GTAG6GTACT 1560 

TCTGCrCAAa 'TGTCCAACAG GGCTATTGGT GCTTTCAAGT TTTTATTTTG TTGTTGTTGT 1620 

TATTTTGAAA AACACATTGT AATATGTTGG GTTTATTTTC CTGTGATTTC TCCTCTGGGC 1680 

CACTGATCCA CAGTTACCAA TTATGAGAGA TA6ATTGATA ACCATCCTTT QGG6CA6CAT 1740 

TCCAGGQATG CAAAATGTGC TAGTCCATGA CCTTTCAATG GAAAGCTTAG GGGCCTOGG6 1800 

TAAATTTGCC CCGTTTAAAT TTGCCCAAAC AGTTTTOCTT TTGTAGAGGG GTGTTTAAAT 1860 

ATACAGCAAT TAAAAAGTTT GTGTGGGGAA AAAAAAAACT CATTGGCAGA TCCAAGAATG 1920 

ACAAACACAA GTGCCXXMTT TCTCTGGATC TCftAGAATGG TGGAGGACCC TGGAAQGACA 1980 

GCAAGGCAGC TCCCCAGCCT CACTCTTCAC TCCTGATTGA GGCCOGGGTT TGTTGTCCAG 2040 

CACCAATTCT GGCTGTCAAT GGOGAGAAAT AAACCAACAA CTTATAATTG TGACACCAGA 2100 

TGCTTA6GAT GCTGGT6CTG GGITAGCTAA GA6AATAGAC AGAATTGGAA AATACTQCAG 2160 

ACATTTCOGA AGAGTTTATA AA6CACAGTG AATTCCT6GT CAATCTCTCC ACTGAGGCAA 2220 

TTTOGAATCa ATAAGCAATT CATAATAGTT TGGAGTAAGG GACTTCATAr AOCTGATTCC 2280 

TCTAGAAGGC TGTCTAACAT ACCACAT6AT TACATGAACT GTATGGTATC CATCTATCTC 2340 

T6TTCTATT6 AATGCCTT6T TAACAGCCAA CACTGAAAAC ACTGTGAGAA TTTGTTTTCA 2400 

GOTCTGACAC CTTTCAfiTCT CTTTTTATAO CAAQAAATCA ATATCCTT7T TATAAAAATT 2460 

CATGTCTGTA TTTCAGOAGC AAACTCTTCA GGCTCCTTTT TTATAAACT6 STGATTTTTC 2520 

TTTTGTCTAA AAAACACATG AAGAAAATTT ACCAGAAAAA AAAAAAAAAG COGAAGAATA 2580 

ATGTTATTTA GAAATTATGC TGTCACTGCC AAACAGTAAC CTCCAGGAGA AAACAAGATG 2640 

AATAOCAGAG GCCAATTCAA TAGAATCAGT TTTTTGATAO CTTTTTAACA GTTATGCTTG 2700 

CATTAATAAT TTCAATGTGG ACCAQACATT CTAATTATAT TTTAIUmSAA ATGTTACAGC 2760 

ATATTTTAAO CAACTCTTTT TATCTATAAT CCIAATATTT GATACT6AAG ACACAGAAAT 2820 

CTTTCACTTG TCTTTAACAT TAGAAAGGAT TTCTCTTTAC TAAGGACTGA TCATTTGAAA 2880 

TAGTTTTCaG TCTTTTGAGA TACAGGTTTA TAACACTGCT TTTTTTTTCC TGTAAACATA 2940 

GGCCATAATG GCAAAAACAA CTAATTTTAA TTGAAGGTCT TGCTTGCCAK TCCTGT6TTG 3000 

GCTTTNACCA AAXATAAAAA TTCCCTrATT CCTTOGTAAT OOTGCAAAIN TTTGGAAAGQ 3060 

CACAGCATCC AAACCAAGCT GCT6TTTGGC TACTGAATG6 CTTQCAGTTG TTCCTOCACT 3120 

CTAAATGGAA TGAGCTTGCT GTGTGTGTGT GT6GTGGTt3G TGGGAGGGGG TGGTGCATGT 3180 

GTGTGTGTQT GT6TGCATCT GCAGCTGCTT CAAAATTAAG AAATACTACA AGACACCCCT 3240 

GTAATGQUTT G6TGGCAACT GGGT6GCACT GCTGATGTGC ACTGTCTAGG GGGGAACGCA 3300 

GTGGTGGTG6 GGTATCTCAA ATGOCCCTAG ACAAOCTTCA GAltGTCTGTA GCTACOUUUk 3360 

ACATTTTGG6 TTCAAGAAAA GTGAGATGAT GGTA6TACTQ (j m 'd U ri't i AAATT6AAAA 3420 

ACCCCAAATG ATGA6GATCT CTTTTTGCXC CCAO'CLTrr TTTTGTAAAC CCATTCAAAA 3480 

CCATTAATAA GCX3CATTTTA CTAANCCCCT ATTTCTTTCT AGAAGCTCAG GGTTnJCTTA 3540 

GT6CCTCCCA NAACArTTTG TAGTTAATTG GGAAAAAGTG ATACTT6GAT TAGGGGGTGT 3600 

GGGCATAAAG AAJG GTGG GA GGCTTGATTT TAAAATTCAO GCCAGAACCC OCAATGACTC 3660 

CACCCATAGT NTCACTTTAG GTCTCATTTA GTOCATCAOC TTTATTTTAA 6TTGAGQAAG 3720 

TGGA6GCTGG TAAAGA6CAG GACCAGAOGA AGAATOCAGA TTTCCTTATG CTTGG6CCTC 3780 

ACACTAGCTC TOTGAGTATT TCCTTGATTG OGGTATATGT ACTACTAGAA AATACCAAAT 3840 

GOATATATTT TCTTTAGGAT AACXTTTGAA CCAACAAIMT TCAATAACRA TAGTACATCT 3900 
TCCATCTOC TTTTAKtOGA GTATAAOGftA ATGTTTCTTT ATQGCCATTT TGGA06GAGC 3960

A6GGGATGA6 6CTTGGOVTA GTCCAAAATT TAAC2TCTCCA ATAATTAATT GGATTTTAAA 4020 

TTGTTTTAAA TTGGCCCACT TTCAAGGCAA mTrm ' G T GTGTCTGTAA CTGAGCTCCT 4080 

CCACCCCTGT CATTCACTTC CAATTTTACC CAATCCAATT TTAGCACTCA AGTTCCATTG 4140 

TGTTAATTTT TGCAOGGTCT ACACACATCA AGTCAGCAAG CATTTGCCAC CACTCCCTAT 4200 

ACTTCTOCCI CTTTTTTACA CACACACACA CACACACACA CACAATCCAT CTCTTGCXTG 4260 

TTCCTACCTC CCTGATTTTT CTTCCCTACA GAAATAGAAA TAGGGACAAA GAAGGOQAAA 4320 

ATX3TATATAT TGaSGCTGGG CTGAACAACT AACTTCATAA GTAGTATTAA CTAGGGGTAA 4380 

ATTQAGAGAA AAGCTCCTTT TCTCTTCACT GTTTTGGAAA OGATAGCCAT TAGCATGACT 4440 

GCTTTGTGTC CTTATGGACT TTAGTATTAG CCTAGATTGA ATTATAGOGT TTTTCTAGCT 4500 

GAAGGAACCT TAA6ATCACA TCATCT A CTC CTCTACTOCA AATTTCTCAT TCTTCAGGOC 4560 

AG6AAACCSA 6ACACAGAGG TAAAGTAATT TCCCCAAGGT CACACAGCTG GCT6GGGCAG 4620 

GATTGGGTTT ACAACCCACA TCTCCTGGCT CTTATTCCAG GGCXTTTTTOC CACTAAGTAO 4680

TATTGCCTTC CATTAGGCTC CTGAGAGTTA TTTCTCAGGQ TCATGrTGCA TCTTGGAGCC 4740 

ACATGCTGCT GCCXTCATCT CAGTGGGAAA TNCACCCAGC AACCTAATAC AGCCCCTTTT 4800 

CCCTGCATTC ACCTGGTTCX: CATCCACATG GGTTGCAGAT GTCCTTGAAG A6ASTGAGGC 4860 

ATTGAGGGCC AATAGGAGCA ATGGGGTOCC TGGCCTTGTC CATCTGAITC AGGAGATCAC 4920 

TGCTCC31T0G TGAGGAGCCC TCTGAATAGC CCCXCACTGA ATGCTTGCXTT TGCCCAAATG 4980 

GAATGGAGGA AGATTGATTT TCTCCATCAG TTCACCTTGT GTCATCTCAT AATGGTTGGT 5040 

CTTTCCAGGC TGAGGGAAAT (jTl ' iVnXjTr TCCAKA6TAII AAAAAAGAAA GAGTGGAACA 5100 

ATANCTTTGT TCATCCTAAC TTTCTGAGAT GGCTTTTCAA CATTTAAAAA AAACTAGTOT 5160 

GGTACX31TTC ACTGGCANGA TTTNTTTTAG AATATGGGAG TAAGATGAGG TAGAGAAAAT 5220 

AACCTGGTCT CACTGTGGTT GCCCTCATCC ACAATGTCCC CAAAGCCATC CTGCTNTGAT 5280

GAGGACAATT TCCAGGTATA AGCAAGGGGC TTTGTGACAA AAATGTACCC TGGCTGATGT 5340 

TAAACATTGG CTCCTGTGTT TGCACCAAAA TAGCAAGCTG TGT6CTCTAT ACACTCTTCC 5400 

CATOSTCTTO TGTACACTGC TCCTGTGGCC TTCCACAGCA GAAACCACGG CAAAAGGGTC 5460 

CAAACACATG GTTTTCCTTG CTGCAAGGCT NTTCCTGGGA ACTAAGGGGG TATTTATTAG 5520 

TTCAGTTNTA AGAGACCTCC TTCT66GCTT AOCCCACTCC TCAGGTACTT CTCTCTCrTT 5580 

CCTCCTTCTC CTOCACAfiTC ACAAGTAACC AAGGAACCTG AAAGTGGATG TGTAGCTATT 5640 

T6AAGAAG6C AAG6AACCCT GAGATTCTTC TTTGAATCCT TTAGTCCAAG TCTTAGACCA 5700 

GTGATTGGTG CTTACCTTGA ACAAAATTTT GTCTGTGTTC CTAATCCCTT CAATACTNTG 5760 

GOTACAATGC TCOCAATCAC 0CT6CACATT TGATTCTAAA TGGCTTTTAI TTTTTAAAAA 5820 

TCCATATCCC TAGGACAAGA NAACAGGATG CCTATATCXX: CAAAATGAGC TCCAGGACAC 5880 

TGATGGGAAT GATCCCAAKG ATCACCCdC CTCAGAAAAC GTCTGTGCCA AMAGACTTOC 5940 

CCAGATAGAA NCACTGQGAC AGTGGTTTGA ACGACTTCTT TTATGGTTGT CCAGTTTGCT 6000 

ATGGAAATAA AAGGCATTGA TTTTTTAAAA AAGATGATTG GAACCTGTCT TTGGCCACAT 6060 

AG0GCCACT7 GGAT0CATT7 CCAGGCCTTA CTCATATATT GOCTTCACTG AAGGGCTTTG 6120 

6CTTTAAGTC CCAGACTOGT CTCCCAAGTO AACCATAAGT GTTTTGGAGC TCATCT6GGG 6180 

TOAOOCATGA GAATGTTGCC CCATCTATCC CTTCAG6AAA AGGTGCCTTC CCTOCTTTC 6240 

TCCTAAAGCC TGGTCCCCAA AAATTGTTTT TGTCTCCAAA AGTCTAGTAT GGTCTTTATA 6300 

CACCCANACT CTTAGTGTTG CXSTCCTGCCT TGTTTCCTTQ TTAAGGATCT ATGCANACCT 6360 

CCOSCTTTGG CTTAGCTAGC GTGACATTGG CTATGATTTO ACAAGACTAA CTTTTTTTTT 6420 

TTTTTTTTTO ACTOAGTCTC CCTCTOTCAC CTAGOCTGGA GTGCAGT06C ACAATCTTGG 6480 

CTOOCTGCAA CCTTCACCCT TCACCTCCCA QGTOGAAGCO ATTCTCCTGC CTCAGTCTCC 6540 

CGAGTAGCTG GCATTACAGG CX3TGCGCCAC CAAATCTGGC TATTTTTTTA TTATTATTAT 6600 

TTTTAGTAGA GATGGGGTTT CACCATGTT6 GCCAGACTGG TCTTGAACTC TTGGCCTCAA 6660 

ATTATCTGCC CACCTOGGCC TCCCAAAGTG CTGGGATTAC AGGCATGAGC ACCATGCCCA 6720 

6CTGACAAGA CTAATTTTTT ATCOCTTGOT TTATTCGCTT CRACATCTTC TGGAATCAGA 6780 

GOTSATTTTT TCTTACCTT6 6ATGCCTGAG ACTAOGGGAG TATAGAATTC CAATTGGTAA 6840 

TTAAGGCATC TTTCTGCTCC TGATCAGAAG GGCAGGTTAG TTGGGAGAGG TCAGATG6CA 6900 

CAACAGAAGT CACCTTGTAA GTAAGGCAAA GACTTTGAAG GCATTAGOGT TTCTCATTAC 6960 

TTAGGTCAAT AACCTTGAGG 6AATCAATG0 CTTTTTTGCC 6CTCTACCTC TTTGTGTATC 7020 

TCTTTQACTT TTCTTTCTCT GTCTAUTTTC CTCrGTTCTC AGTTTATATT CTATGTTATC 7080 

AOTCTCTCTT TCCACAGTAC AAACATCX3VT CCTTTCTCCT GTGCAATTCT 6TCTCTCCCT 7140 

CTTATTATCT TTATTTGTAC TTTTTCCTTC CTCCCTGTCT AGGCATTQGG CATGTGCCTC 7200 

TTCTTAGCCT GTGATTTTGC CTTGGGACTG ATGATAAATT ATTTCCAGAT TCAATCAGCC 7260 

CTGGTCCTAC CCCAGTCCAA TCAGAAGTAT GTTGGTGGGG AATCAACCTG ATCCTGGCCC 7320 

' mfrifn i : ' tccattttca ttostaatcc coctcagcag atctttacaa gcastttcct 7380 

TATAOCTCAT GTATCTTTAO GTCTTTGCCT TOCAAGCACT GTACA6AATA CTTTGTGGTT 7440 

CCTTTTTAGT CTGACATTTT GTOGAGCAGT GAAGOGTGCT CAGA6ACATA ATCAGCTGAA 7500 

GAGAAAAAAT CXACCCATGG ATTTATATCA GCTAAATACT AATAATTGAT nTGTTTGAT 7560 

GTGCOCATAA TTTTTAAAGC TGCAATATAA TATAATGAGG GACCACAG6T AATTTCTCXTr 7620 

GTCATTTGTT TT66CTGGAT G6GGGTG6G6 GA6TAATT6C TIAAAGTTTT ACCATTACAC 7680 

ATTAAACTCT CTATAATAAT CTTGTTrGGG GCTTGCTAAC TGTTGAGCTG TTTTAACTAA 7740 

ACTGGTAOGC AATC566AGTT GATTTAAATG AAAAGATAAT TTAACAAATC TATACTATAA 7800 

AAAGAGACAT TTGCTTAATT GACATGTATT TTTTCCTTCT GAGTCACCTA AACAnTACT 7860 

CTTGACACCA ACXGTTCAT6 ATACTGAATA 6ACAGTCCAT ATAAGAGAAA TTAGTGGACC 7920 

TAAAGAAGCC A6ATTGTAGG TGTTAATTTA TTAAACAGAA T1GC3UAG0C CTTGGAAMt3 7980 

TCACT6CTTG GCAATACCAT ATGGCATGCC AAAATTTACA AIGACTTTTC TTTATAAGTT 8040 

ATCCAAAAGO GATTTQAACA AGTAA6AGGT TATGCCAAAA TGTCTCCAAT GTATG6TCCT 8100 

GTAATATATT GCAGCTTGAA CCCTIATGATC CXTTTATGACT TGTATACAAC TAATGCATGT 8160 

TTTATTGAAT TTTGCATTTC CCAOGTGTGG TAAGTCTTTA AAATGTTTTT GATCACCTTT 8220 

NTGTGCCATT AAACTT6TAC AGAAAATGTT TTTATGGCXSl TTTTCAAAGG GAOAAAOTTT 8280 

AAAATGGAAA CA6CCCACCC TTTCTGCCCT ATAGCTGTAG TTAGAATTGA GTACCT6TAG 8340 

CAAAACAOCT GTAATTGGTG GTTGTAGTGT TAGAGGTGTT AGCTTGCTAG TGACTAGCTT 8400 

TQGAGAGTAA ATGCATG6TA TTGTACATCA CATTTCTTAA CTCGTTTTAA CCTCTGAAAA 8460 

6AATATATTC TTCTTTGTAG TOCTTCTTCC CACCCCCTTG CCCTCTCCCT CTCCCTGCTC 8520 

CCAGTT6TCT TACAGTT6TA AATATCT6AT TTGAGGCCCA ATAACTCTTG CCAAGTAAA6 8580 

TCAGCAAACA ACAAACAAAC CAAAATGT6Q 6QAAAAG6CA TTTCTCAACC ATCTCTCA6C 8640 

AGTTATTGAT CATTTCTTAA GGAACAGCAT TGTGATCAAA GACTCAACTT TAOGTAAAAA 8700 

rCAGTQGTAA ATTG6GGTTG TATTGGCCAT XtSATTACATT CAGGATTGAA TAGTTTTCAG 8760 

AATCACATGT AATCCAAAGA CAGTAGGTA6 TGATGTCCCT TATCCCTGCA 6CTGTTTTAA 8820 

QATAOAGACC TCAGAAGACT CTGCTTGACC GAT6ACCAAT AATTAITTGA AAAAAAAAGA 8880 

AAAAATGAGA GAAATAAAAC AGATATTTAA 6AACTTTA6C CACCTATTTA GAATAGTTAT 8940 

AGCCAGAAAA AAAAACAAG6 GCATGAGTTC AAATGCATTA CTATCAGT6T CCTAGGCAAT 9000 

ACCTAACCTA CTCTGAAATT GTGATTCAAA AGCAGTATTT CAAGAGGCAT TCTCCTTTTT 9060 

I WmwrG ACCCCACnO GACTGGTAGG TTTGGTGAGG CCCCXSVTAAA CCAGCIGGAG 9120 
CAGACCCTTT TCATCTCCTG TGCCTGTAAC ACCCCTCTTC CCCCACCCCC TCCX3CAATTC 9180

AATGAGGGCT TTCTTGGGTC AGAGGACTTC AAGGTltJTCT AGAGAAGTrT GCCATGTGTG 9240 

TAAGGT6CTG TGAACTGT6A GTGCTGAAGA TTOGCAGCAT TCAATACCAQ 6CAGCCAAAG 9300 

ASCTGCrCTT GCMITTATTT TGQCTCTCAA Q C TC TO TTCT TCATCXSCMT Clt JV T Ti 'Ci'O 9360 
TGTACATTTO CAAGATGTGT GTAAltSTOlT TITCC3UUM TAAAATTTGA TTTCAAT 

Seq ID NO: 214 Protein sequence:
Protein Accession #: NP_000546

1          11         21         31         41         51

|          |          |          |          |          |

KELDFGHFDE RDKTSRNMRO SRMNGLPSPT HSAHCSFYRT RTU2ALSNSK KAKKVrfYRN 60 
GDRYPKGIVy AVSSDRFRSF DALIADLTRS LSDNINLPQQ VRVmiDGS RKIGSMDELB 120 
ESSBStWCSSD NFFKECVEyTK NVKPNWSVNV KTSAMMKAPQ SUVSSKSAQA RQJKDFVRPK 180 
LVTIZRS6VK PRKAVRVLLN KKTAHSFGQV LTDITEAIKIi ETGWKKLYT LDGKQVTCLK 240 
DFFGDDDVFI A06PEKFRYA QDOPSLDENE CSWSKiSIPSn TAOPKASPTP QKTSAKSP6P 300 
MRRSRSPADS ANGTSS8QLS TPKSKQSPZS TPTSPGSLRX HXDLYLPLSXi DDSDSLCSSM 

Seq ID NO: 215 DNA sequence
Nucleic Acid Accession #: NM_130467
Coding sequence: 312..644

1          11         21         31         41         51

|          |          |          |          |          |

GGCACX3AGGC AGAGCTCTGC AAGGAGAGGT TGTGTCTTOG TTCTTTCCGC CATCTTOGTT 60 

CTTTCCAACA TCTTCGTTCT TTCTCACTGA CCGAGACTCA GCOGGTAGGT CTOCAGA£3TG 120 

GTCTTCCTGG TAATTTAGTT GTC3AGTGAAT GTGT6GAGGA GCCAGGGGGC TTAGGACAGG 180 

TOCIGTGQCA CAOTOOSTQO CTTTGAGGGA AAAGGGOCTC 60GGTGGTCC TCCGCCTTCC 240 

CCQUSGTOBT OATGCASGOG CCATG6GC03 GTAAT0STG6 CTQGGCTG6A AC6AC3G6AGG 300 

AAGTGAGAGA TATGAGTGAG CATGTAACAA GATCCCAATC CTCAGAAAGA 66AAATGACC 360 

AAGAGTCTTC CCAGCCAGTT GGACCTGTGA TTGTCCAGCR GOCCACTGAG GAAAAACGTC 420 

AAGAAGA6GA ACCACCAACT GATAATCAG6 GTATTGCAOC TAGTGG66AG ATCAAAAATO 480 

AAGGAGCACC TGCTGTTCAA GGGACTGA7G TCGAAGCTTT TCAACAG6AA CTGGCTCTGC 540 

TTAAiSATAGA GGAT6CA0CT GQAGATSGTC CT6ATGTCA0 GGAGGGGACT CTQCCCACTT 600 

TTOATCCCAC TAAAOTGCTQ GAAGCAGGTG AAGGGCAACT ATAGGTTTAA ACCAAGAGAA 660 

ATGAAGACTG AAACCAAGAA TATTGTTCTT ATGCTGGAAA TTTGACTGCT AACATTCTCT 720 
TAATAAAGTT TTACAGTTTT CTGCAAAAAA AAAAAAAAAA AAA 



Seq ID NO: 216 Protein sequence:
Protein Accession #: NP_569734

1          11         21         31         41         51

|          |          |          |          |          |

MSKHVTRSQS SER6NDQESS QPVGPVIVQQ PTEEKRQESB PPTDNQGIAP SGBZRNEGAP 60 
AVOGTDVEAF QQELALUaE DAPGDGPDVR EGTZjPTFDPT KVLEAGBGQL 

Seq ID NO: 217 DNA sequence
Nucleic Acid Accession #: NM_001476.1
Coding sequence: 82..435

1          11         21         31         41         51

|          |          |          |          |          |

GCCAGGGAGC TGTGAGGCAG TQCTGTGTGO TTCCTGCCGT CCGGACTCTT TTTCCTCTAC 60 
TGAOATTCAT CTGTGTQAAA TATGAGTTGG CGAGGAAGAT OGACCTATTA TTGGCCTAGA 120 
CCAASQCGCT ATGTACAGCC TCCTGAAGT6 ATTGG6CCTA TGOOGCCCGA GCAGTTCAGT 180
GATQAACTGO AAOCAGCAAC ACCTGAAOAA GGGGAACCAO CAACTCAACS TCAaOATCCT 240 
GCAGCIGCTC AOGAGGGAGA GGAltSAGGGA 6CATCT6CA0 GTCAAGGGCC GAAGCCTGAA 300 
6CTGATAGCC AGOAACAGGG TCACCCACAG ACTG66TGTG AGTGTGAAGA TGGTCCTGAT 360 
GGGCAGGAGG TGGACCCGCC AAATCCAGAG GAGGTGAAAA 06CCT6AA6A AGGTGAAAAG 420 
CAATCACAGT GTTAAAA6AA GACACGTTGA AATGAT6CA6 GCIGCTCCTA TGTTOGAAAT 480 
TTGTTCATTA AAATTCTCCC AATAAAGCTT TACAGCCTTC TGCAAAA 

Seq ID NO: 218 Protein sequence:
Protein Accession #: NP_001467.1

1          11         21         31         41         51

|          |          |          |          |          |

MSHRGRSTYY HPRPRRYVQP PEVIGPKRPB QPSDEVEPAT PEE6EPATQS ODPAAAlQEGE 60 

DBGASAGQ6P ICPBADSQBCX3 HPQTGCBCED GPDOQEVDPP NPEEVKTPBE GEKQSQC 

Seq ID NO: 219 DNA sequence
Nucleic Acid Accession #: NM_001476
Coding sequence: 90..3671

1          11         21         31         41         51

|          |          |          |          |          |

ACAG06GAGC GCAGAGTQAG AACCACCAAC OGAGGCGCCG GGCAGOGACC CCXGCAGOGG 60 

AGACAGAGAC TOAGOGGCCC GGCAC06CCA TGCCTGCGCT CTX3GCTGGGC TQCTGCCTCT 120 

GCTTCTGGCr OCTCCTGGOC 6CA6G0056G GCAGCTCCAG GAGGGAAGTC TGTGATTGCA 180 

ATG6GAAGTC CAGGCAGT6T ATCTTTGATC GGGAACTTCA CAGACAAACT QGTAATGGAT 240 

TCOGCTGCCr CAACTCCAAT GACAACACTG ATGGCATTCA CTGOGAGAAG TGCAAGAATG 300 

GCTTTTACOG GCACAGA6AA A6GGACCGCT GTTTGCCCTG CAATXtSTAAC TCCAAAGGTr 360 
CTCTTAGIGC TGGATGTG/IC AACTCTGGAC GOTQCAGCTO TAAAOCAGGT GTraCAGGAG 420

CCAffllTGOtSA CCGATGTCTO CCACGCTTOC ACATGCTOIC GGATGOGQOG TGCACOCAAG 480 

ACCAGAGACT GCTAGACTCC AAGTGTGACT GTC5ACCC3W3C TGGCATOSCA GGGCCCTGTO 540 

AOGOGGGCCG CTGTGTCTGC AAGCCAGCTG TTACTX3GAGA ACGCTCTGAT AG GTGTO GAT 600 

CAGGTTACTA TAATCTGGAT GGGGGGAACC CPC5AGGGCTG TACCCAGT6T T TCTGCT ATg 660 

GGCATTCAGC CAGCTGOOGC AGCTCTGCAO AATACAGTGT CCATAA6ATC ACCTCIACCT 720 

TTCATCAAGA TGTTQATGGC TGGAAGGCTG TCCAAOGAAA TGGGTCTOCT GCAAAGCTCC 780 

AATCCTCACA GCGCCATCAA GATGTGTTTA GCTCAGCCCA AOGACTAGAC CCTGTCTATT 840 

TTGTGGCTCC TGCCAAATTT CTTGGGAATC AACAGGTGAG CTATGGGCAA AGCCTGTCCT 900 

TTGACTACCG TGTGGACAGA GGAGGCAGAC ACCCATCTGC CCATGATQTG ATTCTGG AAQ 960 

GTGCTGGTCT ACGGATCACA GCTCCCTTGA TGCCACTTGG CAAGACACTO OCTTCTGGGC 1020 

TCACCAAGAC TTACACATTC AGGTTAAATG AGCATCCAAG CAATAATTGG AGOCGCCAGC 1080 

TGAGTTACTT TGAGTATCGA AGGTTACTCC GGAATCTCAC AGCCCTCOGC ATOOGAGCTA 1140 

CATATGGAGA ATACAGTACT OQGTACATTG ACAATGTGAC CXTTGATTTCA GCCXX5CCCTQ 1200 

TCTCTG G AGC CCCAGC3^CCC TG6GTTGAAC AGTGTATATG TCCTGTTGGG TACA AGGGGC 1260 

AATTCTGCCA GGATTGTGCT TCTGGCTACA AGAGAGATTC AGOSAGACTG GGGCCTTTTG 1320 

GCACCTGTAT TCCTTGTAAC TGTCAAGGGG GAGGGGCCTQ TGATCCAGAC ACAGG AGATT 1380 

GTTATTCAGG GGATGAGAAT OCTGACATTO AGTGTGCTGA CTGCCCAATT GGTTTCTACA 1440 

AOGATCCQCA CGACXXXXXSC AGCTGCAAGC CATGTCCCTG TCATAACGGG TTCAGCTGCT 1500 

CABTGAT6CC GGAGAOGGAG GAGGTG G TCT GCAATAACTG CCCTCCCGGG GTCACOGGTG 1560 

COO Q CIGTGA GCTCTGTGCT GATGGCTACT TTGGGGACOC CTTTGGTGAA CATGGCCCAO 1620 

TGAG6CCTTG TCAGCCCTGT CAATGCAACA ACAATGTGGA CCCCA6TGCC TCTGGGAATT 1680 

GTGAC0G6CT GACAG6CA60 TGTTTGAAGT OTATCCACAA CACA6C0GGC ATCTACTGOG 1740 

ACCAGTGCAA AGCAQGCTAC TTOOGGOACC CATTGGCTCC CAACCCAGCA GACAAGT6TC 1800 

GAGCTTGCAA CTGTAACCCC ATGGGCTCAG AGCCTGTAGG ATGTOSAAGT GATGGCACCT 1860 

GTGTTTGCAA GCCAQGATTT GGTGGCOCCA ACTGTGAGCA TGGAGCATTC AGCTGTCCAG 1920 

CTTGCTATAA TCAAGTGAAG ATTCAGATGG ATCAGTTTAT GCAGCAGCTT CAGAGAATGG 1980 

AGGCCCTGAT TTCAAAGGCT CAGGGTGGTG ATGQAGTAGT ACCTGATACA GftGCTGGAAG 2040 

GCAGGATGCA GCAGGCTGAG CAGGOOCTTC ASGACATTCT GAGAGAT6CC CAGATTTCAG 2100 

AAGGTGCTAG CAGATCCCTT GGTCTOCAGT TGGCCAAGGT GAGGAGOCAA GAGAACAGCT 2160 

ACCAGA6CCG CCTGQATGAC CTCAAGATGA CTGTGGAAAG AGTTOGGGCT CTGGGAAGTC 2220 

AGTACCAGAA CCGAGTTGGG GATACTCACA GGCTCATCAC TCAGATQCAG CTGAGCCTGG 2280 

CAGAAAGTGA AGCTTCXTTTG GGAAACACTA ACKITGCTGC CTCAGACXAC TA CGTG GGGC 2340 

CAAATGGCTT TAAAA6TCTG GCTCAGGAG6 CCACAAGATT AGCAGAAAGC CAOGTTGAGT 2400 

CAGCCAGTAA CATGGAGCAA CTGACAAGGO AAACTGAGGA CTATTOCAAA CAAGCCCTCT 2460 

CACTGGTGOG CAAGGCCCTG CATGAAGGAQ TCGGAAGCGG AA6C6GTAGC CCGGACQGTa 2520 

CTGTGGT6CA AGGGCTTGTG GAAAAATTGG AGAAAACCAA GTCCCTQOOC CAGCAGTTGA 2580 

GIIAGG6A66C CACTCAAGG6 GAAATTGAAG CAGATA63TC TTATCA6CAC AGTCTCOGCX: 2640 

TCCTGGATTC AGTGTCTGQG CTTCAGG6A0 TCAGTGATCA GTCCTTTCAG GTG6AAGAAG 2700 

CAAAGAGGAT CAAACAAAAA GOGGATTCAC TCTCAACGCT GGTAACCAGG CATATGGATG 2760 

AGTTCAAGOG TACACAAAAG AATCTGGGAA ACTGGAAAGA AGAAGCACAG CAGCTCTTAC 2820 

AGAATOGAAA AAGTGGGAGA GA6AAATCA6 ATCAGCT6CT TTOCOGTGCC AATC TTGCTA 2880 

AAAGCAGAGC ACAAGAAGCA CTOAGTATGG GCAATGCCAC TTTTTATQAA GTTGAGA6CA 2940 

TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAOAAAAGCA 6AAGCTGAA6 3000 

AAGCCATGAA GAGACTCTCC TACATCAGCC AGAAGGTTTC AGATGCCAGT GACAAGACCC 3060 

AGCAAGCAGA AAGAGCCCTG GGQAGOGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGG 3120 

CGGGOGAGGC CCTGGAAATC TOCAGTGAGA TTGAACAGGA GA7T6GGAGT CTGAACTTGG 3180 

AAGOCAATQT GACAOCSVaAT GGA6CCTTG0 CCATG6AAAA GGGACT QGOC TCTCT6AAGA 3240 

GItSAGATGAG GQAAGTGGAA GQAGA0CTG6 AAAGQAAGQA GCTGQA6TTT GACAOGAATA 3300 

TGGATGCAGT ACAGATGGTG ATTACAGAAG COCAGAAGGT TGATACCAGA GOCAAGAACG 3360 

CTGGGGTTAC AATCCAAGAC ACACTCAACA CATTAGACGG CCTCCTGCAT CTGATGGACC 3420 

AGCCTCTCAG TGTAOATGAA 6A666GCTG0 TCTTACTGGA GCAGAAGCTT TCCOGAGCCA 3480 

AGAOCCAOAT CAACA8CCAA CT60G60CCA TGATGTCAGA 6CTGGAAGA0 AGGGCAOGTC 3540 

AGCA6AGG6G CCACCTCCAT TT6CTG6ASA CAA6CATAGA TGGGATTCT6 GCT6ATGTGA 3600 

AGAACTTGGA GAACATTAGG GACAACCTGC CCCCAOGCTG CTACAATACC CAGGCTCTT6 3660 

AGCAACAGTG AAGCTGCCAT AAATATTTCT CAACTGAGGT TCTTGGGATA CAGATCTCAG 3720 

GGCTCGGGAG CCATQTCATO TGAGT60GT0 GGATGGGGAC ATTTGAACAT GTTTAATGGQ 3780 

TATGCrCAGG TCAACTGACC TGACCCCATT GCIGATCCCA T66CCAGGTG GTTGTCTTAT 3840

TGCACCATAC TCCTTGCTTC CTGAT6CTG0 GCAAT6AGGC AGATAGCACT GOGTOTOAGA 3900 

ATGATCAAGG ATCTGGACCC CAAA6AATAG ACTGGATGGA AAGACAAACT 6CACAGGCA0 3960 

ATGTTTGCCT CATAATAGTC GTAAGTGGAG TCCTQOAATT TGGACAAGTG CTGTrGGGAT 4020 

ATAGTCAACT TATTCTTTGA GTAATGTGAC TAAAGGAAAA AACTrTGACT TTGCCCAG6C 4080 

ATGAAATTCT TCCTAAT6TC AGAACAGAGT GCAACOCAGT CACACTGTGG CCAGTAAAAT 4140 

ACTATTGCXrr CATATTGTCC TCTGCAAGCT TCTTGCT6AT CAGAGTTCCT CCTACTTACA 4200 

AOCCAOGGTG 'TGAACATGTT CTCCATTTTC AAGCTGGAAG AAGTGAGCAG TGTTGGAGTG 4260 

AGGACCTGTA AGGCAGGCCC ATTCAGAGCT ATGGTGCTTG CTGGTGCCTG CCACCTTCAA 4320 

GTTCTGGACC TGGGCATGAC ATCCTTTCTT TTAATGATGC CATG6CAACT TA GAGATTG C 4380 

ATTTTTATTA AAOCATTTCC TACCAGCAAA 6CAAATGTTG 6GAAAGTATT TACTTTTTOG 4440 

GTTTCAAAGT GATAGAAAAG TGTGGCTTGG GCATT6AAA0 AGGTAAAATT CTCTA6ATTT 4500 

ATTAGTCCTA ATTCAATCCT ACTTTTOGAA CACCAAAAAT GATGCXXaVTC AATGTATTTT 4560 

ATCTTATTTT CTCAATCTCC TCTCTCTTTC CTCCACCXZAT AATAAGAQAA TGTTCCTACT 4620 

CACACTTCAG CIGGGTCACA TCCATCCCTC CATTCATCCT TCCATCCATC TTTCCATOCA 4680 

TTACCTCCAT CCATCCTTCC AACATATATT TATTGAGTAC CTACTGTGT6 OCAOGGGCTQ 4740 

GTGGGACAGT GGTGACATAO TCTCTGCCCT CATAGAGTT6 ATTGTCTAGT GA0QAA6ACA 4800 

AGCATTTTTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCACAAGT GGTOTTTATT 4860 

GCAATAACCG CTTGGTTTGC AACCTCTTTG CTCAACAGAA CATATGTTGC AAGACCCTCC 4920 

CATGGGGGCA CTTGAGTTTT GGCAAGGCTG ACAGAOCTCT G6GTT6T0CA CATTTCTTTG 4980 

CATTCCAGCT GTCACTCTGT GCCTTTCTAC AACTGATT6C AACASACTGT T6AGTTATGA 5040 

TAACACCAGT GGGAATTGCT GQAGGAACCA GAGGCACTTC CACCTT6GCT 6GGAAGACTA 5100 

TG6T0CTG0C 'IT G CTI C mT ATTTCCTTG6 ATTTTCCTGA AA6T6TTTTT AAATAAAGAA 5160 
CAATTGTTA6 AT6CC 

Seq ID NO: 220 Protein sequence:
Protein Accession #: NP_005553

1          11         21         31         41         51

|          |          |          |          |          |
KPfdMhCCCb CFSLLLPAAR ATSRREVC3JC NGKSRQCIPD RELHSQTGJIG FRCLNOIDNT 60

DGIHCEKCKK GFYKHRERDR CLPOICNSKG SLSARCDKSG RCSCKPGV7G ARCDRCLPGP 120 

HMLTQAGCTQ DQRLLDSXCD CDPA6IAGPC OAGRCVCKPA VT6ERCDRCR SGYYNLDGGN 180 

PEQCTQCPCY (SSASCRSSA EYSVHKITST FBQDVDGmA VQRNGSPAKX* QHSQRBQDVF 240 

SSAQRLDFVY FVAPAKFLQ* QOVSYGQSLS FDYRVDRGGR KPSABDVIIiB GAGLRXTAPL 300 

MPLGKTLPCX5 LTKTYTFRUJ EHPSNNWSPQ LSYFEYRRLL HNLTALRIRA TYGBYSTGYI 360 

DNVTLISARP VSGAPAPWVE QCICPVGYKG QPCQDCAS6Y KRDSARLOPF GTCZPQIOQG 420 

GGACSFDTCa) CYSGDaJPDl ECADCPI6FY KDPHDPRSCK PCPCBNGFSC SVMPETSBW 480

CNHCPPGVTG ARCEIiCADGV FGDPP6EHGP VRPOQPOQOI NNVDPSASGN ODRLTGRCLK 540 

CIHNTAGIYC DQCKAGYFGD nAFNPADlCC RAOfCNPKGS BPVGCRSDGT CVCKPGFGGP 600 

KCERGAFSCP ACYNQVKXQM DQPKQQtiQRM EALISKAQGG DGWFDTEUB GRMQQABQAL 660 

QDILROAQIS E6ASRSL6LQ LAKVRSQENS YQSRLDDLKM TVERVRAL6S QYQNRVRDTB 720 

RLITQMQLSL AESEHSUSTt NIPASDEYVG PHGFKSLAQE ATRIAESHVB SASMMBQLTR 780

ETEDYSKQAL SLVRKAI4IE3 V6S6SGSPD6 AWQGLVEKL EKTKSLAQQL TREATQAEZE 840 

ADRSYQHSLR hLDSVSBJJQG VSDQSFQVEB AXRIKQKADS LSTLVTREMD BFKRTQXNLG 900 

NHKEBAQQliL QK6KSGREK5 DQLLSRANLA KSRAQEALSM GJATFYEVES ILKNLREFDL 960 

QVDNBXAEAB EAKKRLSYIS QKVSDASDXT QQABRALGSA AADAQRAIQX9 AGEALBISSB 1020 

IEQBIGSU7L EANVTADGAL AMBKGLASIiK SEMRBVEGBL ERKELEFDm MDAVQMVZTE 1080 

AQKVZrrRAKN AGVTIQDTLN TLDGLLHLMD QPLSVDEEGL VLLBQKLSRA ICIX}IMSQLRP 1140 
MKSELEERAR QQRGBLELXtB TSID6ILADV KKLBZ^IRONL PPGCYMTQAL BQQ 

Seq ID NO: 221 DNA sequence
Nucleic Acid Accession #: NM_016529
Coding sequence: 13..1854

1          11         21         31         41         51

|          |          |          |          |          |

GTCAAGAAAA GAATGTCTGT AATTSTTOGA ACTCCTTCAQ GACGACTTOG GCTTTACTGT 60 

AAAGGGGCTG ATAATGTGAT TTTTGAOAGA CTTTCAAAAG ACTCAAAATA TATGGAGGAA 120 

ACATTATGCC ATCT6GAATA CTTTGCCACO GAAGGCTTGC GGACTCTCTG TGTG6CTTAT 180 

6CTGATC7CT CTGAGAATGA 6TATQAGGA0 9GGCT6AAAG TCTATCAC36A AGCCAGCACC 240 

ATATTGAA6G ACAGAGCTCA AOQGnrTGGAA 6AGTGTTA0G AGATCATTQA GAAC2AATTT6 300 

CTQCTACTTG GAGCCACAGC C3VTAGAAGAT CGCCTTCAAG CAGGAGTTCC AGAAACCATC 360 

GCAACACTGT TGAAGGCAGA AATTAAAATA TG6GTGTTGA CAGGA6ACAA ACAAGAAACT 420 

GG6ATTAATA TAGGGTATTC CTGCOSATTG GTAT08CAGA ATATGGCCCT TATCCTATTG 480 

AAG6AGGACT CTTTGGATGC CAC3UIQ0GCA GCCATTACTC AGCACTQC3VC T6ACCTTGG6 540 

AATTT6CTGG GCAAOQAAAA TOAGGTGGCC CTCATCATOa ATGGCCACAC CCTGAAGTAC 600 

GOGCTCTCCT TOSAAGTCOQ GAGGAGTTTC CTGGATTTGG CACTCTCGTG CAAAGCGGTC 660 

ATATGCTGCA GAGTGTCTCC TCTGCAGAAG TCTGAGATAG TGGATGTGGT GAAGAA0CX3G 720 

GT6AAGGCCA TCACCCT06C CATOGGABAC G6GGCCAA0G ATGTCX3GGAT 6ATCCASACA 780 

GCCCA0GTG6 GTGTG G GAAT CMITGGGAAT GAAGGGATQC AGGCCACCAA CAACTCGGAT 840 

TA06CCATOG CACAOTTTTC CTACTTAGAG AASCITCT8T TGGTTCAT6G AGCCTG6AGC 900 

TACAACC3QGG TGACXAAGTG CATCTTGTAC TGCTTCTATA AGAAOSTGGT OCTGTATATT 960 

ATTGAGCTTT GGTTCGCCTT TGTTAATGGA TTTTCTGGGC AGATTTTATT TGAAOGTTGG 1020 

T6CAT0G6CC TGTACAATGT GATTTTCACC GCnTG C O G C CCTTCACTCT GGGAATCTTT 1080 

6AGAGGTCTT OGACTCAOQA GA6CATGCTC AQOTTTOCOC AGCTCTACAA AATCACCCAG 1140 

AATGGOSAAG GCTTCAACAC AAAGSTTTTC TGOaGTCACT GCATCAAOSC CTTG6TCCAC 1200 

TCCCTCATCC TCTTCTGGTT TCCCATGAAA GCTCTGGAGC AT6ATACTGT GTTTGACAGT 1260 

OGTCATGCTA CCGACTATTT ATTTGTTGGA AATATTGTTT ACACATATGT TQTTOTTACT 1320 

GTTTGTCTGA AAGCTGGTTT GGAGAOCACA GCTTGGACTA AATTCAGTCA TCTGGCTGTC 1380 

TG6GGAA6CA TGCTGACCTO GCTaSTGTTT TTTGQCATCT ACTOGACCAT CTGGCOCAOC 1440 

ATTCCCATTG CTCCAGATAT QAQAGGACAG 6CAACTAT0G TCCTGAGCTC CGCACACTTC 1500 

TGGTTGGOAT TATTTCTGGT TCCTACTGCC TGTTTGATTG AAGATGTGGC ATQGAGAGCA 1560 

GCCAAGCACA CCTGCAAAAA GACATTGCTG GAGGAGGTGC AGGAGCTGGA AACCAAGTCT 1620

OGAGTCCTGG GAAAAGOGGT GCT6CX3GGAT A6CAATGGAA AGAGGCT6AA 06AG0GCGAC 1680 

06CCTGATCA AGAGGCT06G OCGGAAGAOQ GOOO0SAQ6C TOTTOOGQaa GAOCTCCCTQ 1740 

CAGCAGGGOS TCCXX3CAT6G 6TATOCTTTT TCTCAAGAA8 AACAOGGAGC TGTTAGTCAG 1800 

GAAGAAGTCA TCCGTGCTTA TQACACCACC AAAAAGAAAT CCAGGAAGAA ATAAI^CATG 1860 

AATTTTCCTG ACTGATCTTA GGAAAGAGAT TCAGTTTGTT 6CACCCAGTG TTAACACATC 1920 

TTTQTCAGAO AAGACTGGOG TCXIAAGGCCA AAACACCAGG AAACACATTT CTGTGGCCTT 1980 

AGTTAAGCAG TTTGTTAGTT ACATATTCCC TCGCAAAOCT GGAGTSCAaA 0CACAGG8GA 2040 

AGCTATCTTT GCCCTCCCAA CTCX3TCT0CA GTGCTTAOCC TAACTTTTGT TTATGTOGTT 2100 

ATGAA6CATT CAACTGTGCT CTQTaAGOTC TCAAATTAAA AACATTATGT TTCACCAATA 2160 
AGAAAAAAAA AAAAAAA 

Seq ID NO: 222 Protein sequence:
Protein Accession #: NP_057613

1          11         21         31         41         51

|          |          |          |          |          |

MSVZVRTPSG RLRLYCKGAD NVZPERLSKD SKYMEBTLCH LEYPATEGLR TLCVAYADLS 60 

EMHYEEWLKV YQEASTZIJO? RAQRLEECYE ZZEKNLLUiG ATAZEDRLQA GVPETZATLL 120 

KAEZKZWVLT GDKQETAZNZ GYSCRLVSQN MALZItLKEDS LDATRAAITQ HCTDLGNLL6 180 

KENDVALIZD GHTLKYALSF BVRRSFLDIA LSCKAVZCCR VSPLQKSEZV DWKKRVKAZ 240 

TLAZGDGAND VGMIQTAHVG VGZS07EQ4Q ATNNSDYAZA QPSYLSKLIiL VBGAHSYNRV 300 

TKCZLYCFYK NWLYlZEIiW PAFVN6PSGQ ZLPBRWCZGL YNVZFTALPP PTLGZFERSC 360 

TQESMLRFPQ LYKZTQNGEG FNTKVFHGHC ZNALVHSLZL FWPPMKALSa DTVFDSGKAT 420 

DYLPVGNTVY TYVWTVCLK AGLBTTAWTK PSHIAVWGSM LTWLVFPGZY STZWPTZPZA 480 

FDMRGOATMV LSSAHFHLGL PLVPTACLZB Z>VAHRAAKHT CKKTU^EEVQ ELETKSRVLG 540 

KAVLRDSNGR RZJ3ERDRLZK RLGRKTPPTL PRGSSLQQGV PH6YAFSQEB HGAVSQSEVZ 600 
BAYDTTXKKS RXK 

Seq ID NO: 223 DNA sequence
Nucleic Acid Accession #: BC017001
Coding sequence: 1..394

1          11         21         31         41         51

|          |          |          |          |          |

AACS3CTGGGC AGG6COGGC3G CaOGGTOGGGG GGOGOCOGAG GGGCC0G6GC CGAGCGGCGG 60 

OGOGCAGGGC GGCAGCATCC ACTCGGGCCX5 CATCX3COG0G GTGCACAAOG TGCCGCTGAG 120 

CGTGCrCATC OGGCCGCTGC CGTCOGTGTT GGACCCOGCC AAGGTGCAGft GCCTCGTGGA 180

CACGATCOSG GAfiGACXXAG ACAGCGTGCC CCCCaVTOSAT GTCCTCTGGA TCAAAGGGGC 240

CCAGGGAGGT GACTACTTCT ACTCCTTTGG GGGCTGCCAC CX3CTAOSOG6 CCTACC ftG^ 300 

ACTGCAGOGA GAGAOCATCC COGCCAAGCT TGTCCAGTCC ACTCTCTCAG AOCTAAGGGT 360 

GTACCTGGGA GCATCCACAC CAGACTTGCA GTAGCAGCCT CCTTGGCACC T6CTGCCACC 420 

TTCAAGAGCC CAiSAAGACAC ACCTGGCCTC CAQCAGGCTG GGCCATGCAG AAGGGATAGC 480

AGGGGT6CAT TCTCTTTGCA OCTGGCGAtSA GGGTCTGACT CTGGGCAOCC CTCTCACOSG 540

CTACAAGGCC TTGGACTCAC TCTACAGTGT GGGA60CCCA GTTCCX3iOCT CT6TGACAAT 600 

AG6ATCATG0 CCTTACCCTT GAAGCATTAC OCS^SAAGGAO AACAGAGATG GGCTTGAA GA 660 

GGCACGTGCT GOC3G6CTCCA AATTCCCAAG GACAAGGATC CCTCTGCATT UTTGl'CTATG 720 

TAAOCTCTTA TATQGACTAC ATTCAGCTGC AAGGAAAGGA AAACCTTGAT TGCAGTGGTT 780

TAAACAAACA GAAGATTGTT TTTCCACATA GCATGGATTC TGGAGATGGG TGGCTAATGG 840

TATTGGTTCA ACAACTCCAC GGAGGTAGGG GTCACGTCTT GGATCCTTTT GCCTTAATCT 900 

CAGTGCTCGT TACTTCATG6 TCCCAAGATG GCTGCTGTAT OCCCAAGAAT CATGTCTG06 960 

TTCMGGAAG GAGGGGTG6A GGAAGAGGAA GOGCCAAACT AGCTGGACCC GTCACXTTTCT 1020 

ATCAGAAA6T AAAACCTCGT CAGAAGTCT6 TTl'CCitjCl' C TCTCCCTCTG CATATCITCA 1080 

CTTAGATCCC CTTGGCCCGA GCCAGCTAOC ATTGCACCTC TAGCTGCAAA CAAAGCTAAG 1140

ACAGCAGGGA ACAGAATTGT CATCGCTGAA TAGACOVATC GTGTTCCATC TACTGAGACT 1200 

GGCACACIGC CTCCTGCAAT AAAACTGGGA TCCCATTACC AAGAGAGAAA TGCAGAATTG 1260 

TGTACCAGTT AGCTTTTGCT GTGTAACAAA CCATOCCCAA ACTTOGCAGC TAGAAACAAA 1320 

CCCTGTATTT TGOCACAATC CTAT66GTTG GCAATTTGGG CTOGOCTCAA CftGGGCAGTT 1380 

CTGCroCTCA CACXTOGGAT CCCTCATGGA GCTAAGGTCA GCTGTTACCT CAGCTGGGCC 1440

TGGATGGTCr AGGATAGCCT TACTCACTTQ CCTGGCAGGT GACAGGCTGT TGGCTGGAAT 1500 

TGCTTG6TTC TOCTCCATGT GGCCTCTCCA GCAGGCTAGC TCAG GCTTA T TCACATGATG 1560 

GCTTCAGGAT TGCAAAGAGA GTGAQRGTAG AAGCIGAAAG ACTTCTTGAO TTCTTGGCCT 1620 

G6AACTGGGA CTAGGACAGT GTCACTTCTG CTAAGTTCrr TTQQTCAGAe CAAATCACAA 1680

GGCTTTACCC AGATTCAAOG 6ATGAGAAAC AGACTACATG TCTTGATGAG GGGAACCACA 1740

AAGAOCTTGT GGCCATTTTT CACCTATCAC AAATAATTTT GGATGGGTAT TTATTTGGAT 1800 

AAA06TATTT CCCTCTTCCC OCTTTCTCTC TGTCTCATGG GGCCTCACTC TGCCAAGTTG 1860

GAAOGCACTA AGACATTSTC CTOOCCCTCA GGGTCTAGGG GAAG AGGTGT TGGGGCAGGA 1920 

AGT6AGTCTC TOCATGQGCT QQAGOCACIG TAGTAGGAGT GCCTCCTTOT CTGCACPGCT 1980 

GGTATGGGGT TAGGCCAGGT AGGACATTCC AGAGGGGCTT CTGAAAACCA AGAGTCCCTG 2040

GGGAAAGGGA ACAGAGTAAG GCAGGCCTTO TTCTCACTGC CCTCTAAGGG AACTTG GTCA 2100 

CTOOGCACTT TTAAGCCTCA GTTTCTCCAO TTCAATAATA AGGACAAGAG CTTTTCCCAT 2160 

GCATTCTCTT TCCCOGGQAA AGTTOACTGA GGTGA0CM3T AATAGAATTQ AAAAGGGAGA 2220 

GTCTCTTCAG TtSCAATGTGG OITCCTGGAT TGQGTCTTGG AACAAAAACA GGACATTAST 2280 

GGGAAAATTO GAAATCTGAA AAAAGTCTGA ATTTTAGTTA ATATACCAAT TTCAGTCTCT 2340

TGGTTTTGAC AGATGTACCA TX3GTX3ATGTA AGATGTTGAC CTTGGGGTAG GCTGGGTGAA 2400 

GOOTATACRO GAACTCTTTG TACTATCTCT QCAACTTCTC TGTAAATCTA GTATCATTCC 2460 

AAAATAAAAG TTTATTTAAT TTAAAAAAAA AAAAAAAAAA AA 

Seq ID NO: 224 Protein sequence:
Protein Accession #: AAH17001.1

1          11         21         31         41         51

|          |          |          |          |          |

TliGRAGAGRQ APBGPGPSGG AQGGSIHSGR lAAVHNVPLS VLIRPLPSVL DPAKVQSLVD 60 

TIRBDPDSVP PIDVLWIKGA QGGDYFYSFG GCHRYAAYOQ LQRBTIPAKL VQSTLSDLRV 120 
YL6A8TFDLQ 

Seq ID NO: 225 DNA sequence
Nucleic Acid Accession #: NM_021048
Coding sequence: 1..1110

1          11         21         31         41         51

|          |          |          |          |          |

ATGCCTCX3AG CTCCAAAOCG TCAGCGCTGC ATGCCTGAAS AA6ATCTTCA AT OOCAA AGT 60 

QAGACACAGG GCCTCGAGGG TGCACAGGCT CCCCTGOCTG TGGAGGAGQA TGCTTCATCA 120 

TCCACTTCCA CCAGCTCCTC TTTTCCATCC TCTTTTCCCT CCTCCTCCTC TTCCTCCTCC 180 

TCCTCCTGCT ATCCTCTAAT ACCAAGCACC CX^^GAGGAGG TTTCTGCTGA TGATGAGACA 240

CCAAATCCTC CCCAGAOTGC TCAGATAGCC TGCTCCTCCC CCTCGGTCGT TGCTTCCCTT 300 

CCATTAGATC AATCTGATGA GGGCTCCAGC AGCCAAAAGG AQGAGAGTCC AAGC ACCCTA 360 

CAGOTCCTGC CAGACAGTGA GTCTTTACCC AGAAGTOASA TAGATGAAAA GGTGACTGAT 420 

TTGGTGCAGT TTCTGCTCTT CAAGTATCAA ATGAAGQAQC OGATCACAAA GGCA6AAATA 480 

CTQGA6AST6 TCA2AAAAAA TTATGAAGAC CACTTOCCTT TGTTGTTTAG TGAAGCCTCC 540

GAGTCCATGC TGCTOOTCTT TGGCATTGAT GTAAAGGAA6 TGGATCCCAC tGGCCACTCC 600 

rri ' G 'l'U CT 'l' G TCACCTCCCT GGOOCTCACC TATGATGGGA TGCTtyWSTGA TGTCCAGRGC 660 

ATGCCCAAGA CTG6CATTCT CATACTTATC CTAAGCATAA TCTTCATAGA GGGCTACTGC 720 

ACCCCTGAGG AGGTCATCTG GGAAGCACTG AATATGATGG GGCTGTATGA TGGGATGGAO 780 

CACCTCATTT ATGGGGAGCC C3U36AAGCTG CTCACCCAAG ATTGGGTGCA GGAAAACTAC 840

CTGGAfiTACC GGCAQGTGCC TOOCAGTGAT CCTGCAOGGT ATGAGTTTCT GTGGG GTCCA 900 

AGGGCTCATG CTGAAATTAG GAAGATGAGT CTCCTGAAAT TTTTGQCCAA GGTAAATGGG 960 

AGTGATCCAA GATOCTTCCC ACTGTGGTAT GAQGAQGCTT TGAAAGATGA GGAAGAGAGA 1020 

GCCCAGGACA GAATT6CCAC CACAGATGAT ACTACTGOCA TGGCCAGTGC AAGTrCTAOC 1080 

OCTACaGCTA GCTTCTCCTA CCCTQAAIAA

Seq ID HOt 226 Protein sequences 
Protein Accession fi: NP_066386 

85 1 11 21 31 41 51 

HPRAPKRQRC iffEEDLQSQS BTQ6LEGAQA PIiAVBBDASS 8TSTSSSFPS SFPSSSSSSS 60 
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WO 02/086443 

SSCyPLIPST PBBVSM3SET FNPPQSAQZA CSSFSWASL 
CVLFDSBSLP RSBIOBKVTD LVQFLLFKXQ NKEPITKAEI 

EOOjLVFGID VKEVDPTGHS FVLVTSIiGLT YDCSMLSDVQS 
TPEEVIHEAL KMKGLyiXa>lB HLIYGEPRKL LT(^3WVQSTY 
RAHAEIRKMS LLKFLAKVMG SDPRSFPLWY BBALKDEEER 
ATGSPSYPE 

Seq ID IiK>: 227 DNA sequence 

Hucleic Acid Accession 8i MM_00S025.1 

Coding sequence* 82 •1314 



PCT/US02/12476 



PLDQSDB6S8 8QXEBSPSTL 
LBSVZKHyED BFPLLFSEAS 
MPKTGILILI LSIIPIBGYC 
LEYRQVPGSD PARYEPLWGP 
AQDRIATTDD TTAMASASSS 



GCGGAGCACA 
GAGGCTTGAA 
AGTATGGCTA 
TATAATOGTC 
GCTCTTGCAA 
CACTCAATXSG 
TCAAACATG6 
6T6CAAAAT0 
6CA6CAGTAA 
T6GGTGGA6A 
GCTGCCACTT 
TTTAGGCCT6 
ATTCCAATQA 
GAAGCTQOTO 
ATGCTGGTGC 
CAGCTGGTTG 
AGGTTCACAG 
GAAATTTTCA 
TCCAAA6CAA 
GTCTCAGGAA 
CATCCATTTT 
GTCATGCATC 
TTATTTGAAT 
TA6GATTTGT 
AATATATGTA 
TGTTATGTCA 



11 
I 

GTCCGCOGAG 
ACTGTTACAA 
CAGGGCSOCAC 
TTAQAGCCAC 
TGGGAATGAT 
GATATGACAG 
TAACTGCTAA 
6ATTTCATGT 
ATCAT0T6QA 
ATAACACAAA 
ATCTGGCCCT 
AAAATACTAG 
TGTATCAGCA 
GTATCTACCA 
TCTCCAGACA 
AA6AATGGGC 
TGGAACASGA 
TCAAAGATGC 
TTCACAAGTC 
TGATTGCAAT 
TCTTTCTTAT 
CXOAAACAAT 
AACAAG6AAA 
GTTTTACAGT 
AATTATAAGT 



21 

I 

C3VCAAGCTCC 
TATGGCTTTC 
TTTCCCTGAQ 
T6GTGAAGAT 
GGAACTTGGG 
CCTAAAAAAT 
AGASAGCCAA 
CAATGAGGAS 
CTTCAGTCAA 
CAATCTGGTG 
CATTAATGCT 

AAocrrrrcT 

AOGAQAATTT 
AGTCCTA6AA 
GOAAGTTCCT 
AAACTCTGTG 
AATTGATTTA 
AAATTTGACA 
CTTCCTAGAG 
TAGTAOGATG 
CAGAAACAOO 
GAACACAAGT 
ACAGTAACTA 
ATATCTTAAQ 
AACTTGTCAA 
GTGCTGTTGT 



31 
1 

AGCATCCCGT 
CTTGGACTCT 
GAAGGCATTG 
GAAAATATTC 
GCCCAAGGAT 
GGT6AAGAAT 
TATGTGATGA 
TTTTT6CAAA 
AATGTA6006 
AT^AGATTTGG 
GTCTATTTCA 
TTCACTAAAG 
TATTATGQGQ 
ATACCATATO 
CTTGCTACTC 
AAGAAGCAAA 
AAAGATGTTT 
GGCCTCTCTG 
GTTAATQAAG 
GCTGTGCTGT 
AGAACTGGTA 
GGACAT6ATT 
A6CACATTAT 
ATAATATTTA 
GGAATGTTAT 
TTAAAA3AAA 



41 

1 

CA6GGGTTGC 
TCTCTTTGCT 
CnSlCTTGTC 
TCTTCTCTOC 
CTACXX3^GAA 
TTTCTTTCTT 
AAAITGCCAA 
TQATOAAAAA 
TG6CCAACTA 
TATCCCX3W3 
A6GGGAACTG 
ATGAT6AAAG 
AATTTAOTQA 
AAGGAC3ATGA 
TGGAGCCATT 
AAGTAGAAGT 
TGAAGGCTCT 
ATAATAAGGA 
AAG6CTCAGA 
ATCCTCAAGT 
CAATTCTATT 
TCGAAGAACT 
GTTTGCAACT 
AAA7AGTTCC 
CA6TATTAAG 
AGTACCTATT 



Seq ID NO: 228 Protein sequence:
Protein Accession #: NP_005016.1



I 

MAFLGLPSIiL 
ELGAQGSTQK 
NEBFLQMMKK 
INAVYFKK^ 
VLEIPYEGDE 
IDLKDVLKAL 
SKMAVLYPQV 



11 
I 

VLQSMATGAT 
EIRHSMGYDS 
YFNMVNHVD 
KSQFRPENTR 
ISMMLVLSRQ 
GITEIPIKDA 
IVDHPPFFLI 



21 
I 

PPEEAIADLS 
IjKNGEEFSPL 
FSQNVAVANY 
TFSFTKDDE8 
EVPLATLEPL 
NXjTGLSDKKE 
RNRRTGTILF 



31 

I 

VNMYNRLRAT 
KEFSNMVTAK 
INRWVENNIM 
EVQZPMMYQQ 
VKAQliVEEWA 
IFLSKAIHKS 
MGRVMEPBTM 



41 

1 

GEDEHILFSP 

esqyvmkian 
nlvkdlvspr 
qepyygefsd 
nsvkkqkvev 

FLEVNEEGSE 
NTSGHDFEBL 



Seq ID NO: 229 DNA sequence
Nucleic Acid Accession #: NM_003695
Coding sequence: 12-398 



GGACATCAQA 
CAGCCCTTAC 
TCTGCCOOGC 
ATCTG6TGAA 
TCA6CA60BG 
ACAACX3CT6C 
TGAGCCTCCT 
TCAT6CCTTT 
OGGTGCCAGO 
CATG6AAT6C 
ACAGAGGATG 
GATTTCACAC 
TAAATGATTT 



11 
I 

QATGAGGACA 
CCTGOGCTGC 
CAGCTCTCGC 
GAAGGACTGT 
CACCAGCTCC 
ACXXy^CCOOC 
GGCOGTCATC 

cxnrcccTTT 

AQCCCCAG6C 
TGATGACTTG 
CAGCCCCCAG 
TCCTTCTCTT 
AAACC 



21 
I 

GCATT6CTGC 
CAOGTGTGCA 
TTCTGCAAGA 
GOGGAGTOGT 
ACCCAGTGCT 
AC06CCCT0G 
TTAGCCCCCa 
CTCTGGGGAT 
T6AGG6CTTC 

CTGCATGGAA 
TlW l MiXt i 't 



31 
I 

TCCTTX3CAGC 
CCAGCTCCAG 
CCACGAACAC 
GCACACCCAS 
6CCAGGAG6A 
CCCACAGT6C 
GCCTGTGACC 
TCCACACCTC 
CCOGAAAGTC 
CACAQACOCC 
6GTGGAGGAC 
TTATTTTGTA 



41 

I 

CCTX3GCTGTG 
CAACTGCAAG 
AGTGGAGCCT 
CTACACCCTQ 
CCT6T0CAAT 
CCTCAGCCTG 
TTCCXXXXSUS 
TCTTCCCCA6 
TGGGACCAG6 
ACAOAGGATG 
AGAAOCCCTG 
CTCAAATCTC 



51
I 

AGGTGTGTGG 
GGTTCT6CAA 
AGTGAATAT6 
ATT6AGTATT 
AGAAATCOGC 
GAAGGAGTTT 
TTCCTTGTTT 
AXATTTTAAT 
CATCAATAAG 
GGATTTTGAT 
GAAGTOGCAG 
TGAAGTCCAA 
7G6CTCCAAT 
AATAAGCAT6 
AGTCAAAGCA 
ATACCTGCCC 
TGGAATAACT 
GATTTTTCTT 
AGCTGCTGCT 

tattgtog;^ 
catgggacga 

TTAAGTTACT 
GGTATATATT 
AGATAAAAAC 
CTAATGGTCC 
GAACATGTG 



51 
I 

LSIALAMOW 
SZiFVOHGFEV 
OFDAATYXiAL 
GSNEAOGIVQ 
YLFRFTVBQB 
AAAVSGMIAI 



51 

1 

GCTACAGGGC 
CATTCTGTGG 
CT6AGGGGGA 
CAAG6CCAGG 
GAOAAGCTGC 
GOGCTOGCCC 
6GAAGGCCCC 
CCGGCAACGG 
TOCAGGTGGG 
AAGOCACCGC 
TQGATCCOOQ 
TACATGGAGA 



Seq ID NO: 230 Protein sequence:
Protein Accession #: NP_003686

1 11 21 31 41 51 

I I I I I i 

HRTALLLLAA LAVATGPALT LRCHVCTSSS KCKHSWCPA SSRPCKTTNT VEPLSGNLVK 
KDCAESCTPS yTLQOQfVSSO TSSTQCOQB) IiQIEKIiHKAA PTRTAliAHSA LSLSIALSLL 
AVUAPSL 



Seq ID NO: 231 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 126-752 




1 11 21 31 41 51 

1 I i 1 I I 

C06GGCAGGT GGCTCATGCT CGG6AG0GT6 GTTGAGOGGC TGGUUOGGUT OTCCTGGAGC 
AGGGGOGCAG GAATTCTGAT GTGMVACIMl CAGTCIGIGH Q0C3CTGQhAC CTOC AC l'C a Q 
AGAAGATGAA GGATATOGAC ATAOQAAAAG AGTATATCAT CCOCAGTCCT 66G7ATAGAA 
GTGTGAGGGA GAGAACCAGC ACXTCT6GGA C6CACA6AQA CCX3TGAAGAT TCCAAGTTCA 
GGAGAACTOG AC0GTT6GAA T6CCAAGATG CCTTGGAAAC AGCAGCCCGA 6C06AGGGCC 
TCTCTCTTGA TGCCTCCATG CATTCTCAGC TC3U5AATCCT GGATGAGGAG CATCTCAAGG 
GAAAGTACX3^ TCATGGCTTG AGTGCTCTGA AGCCCATCCG GACTACTTCC AAACACCAGC 
ACCCAGTGGA CAATGCTGGG cm ' mxxri ' GTATGACTTT TTOGTGGCTT TCTTCTCTGG 
COOGItSTGGC CCACAAGAAG GGGGA6C7CT CAATGGAAGA CGTGT66TCT CTGTCCAAGC 
AC8AGTCTTC T6A06T6AAC TGCAGAAGAC TAGAGAGACT GTGGCAAGAA GAGCTGAATG 
AAGTTGOQCC AGAOGCTGCT TCCCTGOGAA GGGTTGTGTG GATCTTCPGC OGCACCAGGC 
TCAKrTGTC CATOGTGTGC CTGATGATCA OGCAGCTGGC TGGCTTCAGT GGACCAAATT 
TTCAQGATGG CTGTATTCTG CGGTCAGAAT GAGAGAGTCA AGCTGGGCAG AATCTCTG6C 
CAAGAQTTCA GCCTTCCTTT GGAOACTGCT CCATCAGTGC OGftGGTGTGT GGGAACAGGC 
TTCACTGCAC 06CCATCTTA CTOAGTTGCT TCAOGTGAGG AAAAOGGGGC TTTGGCCCTG 
TGACTCAGTT CCACATTTTG GATTGCATAC TGQAAAAGAA GCCAATCTTC TT6CTAGTAA 
ACCAGCAACC CGQCTGTATA CAGTGGTGAC CCAAGCAATG GATATAAACC TAAAAATCTG 
AGGQAGGGGA GAQGTGGAAT ACAGTAGTTC TTGGAATCTO AAGTCTCCTA TTTGATCAG6 
TT A 'lTfOCiXS GGACTTG6CA AAAATCTGAT T0GT6GGGAT CTCCTAGGAC CTAGTGGACA 
TCTGGTATTA ATTTAATCTC AGOAAAAACA AGAAATTAAC CCAGAGAGAG TCTGGGTTTT 
GGAATTCAGC GTA6CTACCT CCAGAC O GT G 6TOTCTG6CC TCCATTTTTG TCTGTCATTC 
AGCTCTGACT TACAGCTGCA GTCACCTTTG CTATAAGGCA CCTGGGTAGA AGGGTGGATG 
GGCTTCACAT CAATTTTTTT CTTCCTTTAG GGTGGGGGAT TGGTTTGGCT TTCTTTTGTT 

G i mflT r m * gttttatttt tgtcaagatt gatttttaga tgcaaggact tgaaaagacc 

CAGAAGGATG CCACCAGTTT TTCCTTGA6G CCTAGGATTT TTTATTCZGT COOGAGCAGA 
GGTAATTCCT CACAACTTAG TGCACCAOTA 6CACCAGCCA TTTPGAGCAO A6TACCTCTT 
TGGGGAGCTT TTOOTTTTGT TTTGTTTTTA ATTCTCTTTC CTTAGCAGCA AGGTCTTTTT 
TCCTAGAGAA TCTACTCOGT TGCAGAATCA TTGCAACCTC AGGAGCCCTC ACTGATTGAG 
TGCTGTCA6C CTGATATACT ACTTTGGACT CTG6AAACAG ATATGGGTTC TATTCrCTAT 
TTCTACTGTG TGTGC3TTAAA CAACOGTCGG AGAGCAGATG ACCTGTTAGA TGGCTAOTCC 
TGTATAACTC GACTCTGTAT QTTTCftAXGT ATGTTACTGC AATGCTTCAC CIGCTGTACA 
GTGTTTGTGA QATQCTCTTT GAAOATGGTA CTTTTATATT T 

Seq ID NO: 232 Protein sequence:
Protein Accession #: Eos sequence 

1 11 21 31 41 51

I I I I I I

MKDIDIGKEY IIPSPGYRSV RERTSTSGTH RDREDSKFRR TRPLEOQDAL ETAARAE6LS 
LDASMHSQLR XIDBBSPKGK VHEGLSALKP IRTTSKHQHP VDUAGLPS04 TFSWLSSIAR 
VARKKGELSN EDVHSLSKBE SSDVNCRRXiB RLNQERUnSV OPDAASLRKV VWIFCRTRLI 
LSZVCUUTQ LAGPSGPNFQ DGCZIiRSB 

Seq ID NO: 233 DNA sequence

Nucleic Acid Accession #: CAT cluster 



TTTTAATGGT 
CACAACATGA 
TGTTTCTTAG 
TAGTAAACTT 
TTTGTTTCCA 
TOTTTGACTG 
TT0TTAT6GT 
AA6CTATTTT 
TATATTATGG 
ATTAGCCTAC 
AATAGGTTAA 
AATTAAAGTA 
TATGAAAGCT 
TATATATTTT 
TATCCTGTAT 
TAGAAGACCC 
TTTTTTTTAG 
GATTCTGTTA 
TCCAGATAAA 
CTTAGGCTCT 
CCATAGTATC 
ACTATAATA6 
TATCAATGTT 
TACATAAAAA 
AAGTCATTAA 
AGAATCATCT 
GGAA6AIGGA 
QAOTTGAAGA 
CTTTTAAAAC 
TATTCTTCTA 
AGCCCTTGTG 
CTGTTCAGGT 
TGAGA7GGA0 
ATCATA306A 
TTTTAATTCT 
CAATOCCCAA 
TTTCTGtSUlT 



11 
I 

GCTCATATAT 
ATCACATAAT 
ATOGTAGTCA 
CTCTTTCATC 
AAGTCACAAT 
AAAOCAAAAC 
ATATCTTTTT 
AATCATCAAG 
ATTAACCAQA 
TATAGTATTA 
AAAGTAGTTA 
CA1TTTAAAT 
CTGGCTATCA 
TTAGTTCCTT 
TTTTTTTAAG 
ACTCTTACTA 
GTA6TTTAAA 
AGCATCCAAA 
AGAT6GA6AA 
CTCCATGTAT 
AAGTQGAGGG 
AAGTTTGAQT 
ATCCAATGAT 
GT6CTCATGT 
TAATTTAATA 
GAGGACTTTT 
GCCATGCTTQ 
CCACTGCTCT 
TAGTAATGTA 
ATAACTAGCA 
TAAGATTATT 
0A6AAAACAT 
GAG6TQGGCA 
AGTTT6GAAA 
CACTGTCTCA 
GGGGCAGTGT 
TGCAOCTACA 



21 

1 

ACTGTATTTT 
CATGATTTTT 
TTGAGAAGTC 
TTTGTGTTAO 
TQAATTATTC 
AAC6TGACAG 
ATTAAATATT 
TATGGAAAAC 
ATTGTATCAT 
ATAAATTTTT 
CAAATTAAAC 
QAGCTTTATA 
TCCTG6GATA 
TGAGATAACT 
AATTGTTTTA 
GGTTCCCTAA 
GCAAGCACTG 
AACAATGCCr 
TACCTCAT6T 
CTTTCTTAAG 
TAGTTCAGAA 
AATATTTTAA 
TTTTATTAAA 
ATTT6AATTT 
ATTGTTTTAA 
AATATGGAAT 
TTTTTCCAAA 
AAATTAGTGC 
CCCAGTTAAG 
TTTATTACAT 
ATTTCTTCTC 
AAIG6ATTTT 
CATTTAAOGT 
AGAGAGCTTA 
AAAGAGAATC 
TACCTTACTC 
TGTTTTCTTA 



31 

I 

TTGTTGTTTA 
TTTTTTTACT 
CCAATPACTC 
CrCTGTAGTC 
TTAGATACCT 
TTTATTTTCA 
TATTTTGACT 

aaattactat 
ttttggccta 

CAGTTGGTTT 
TTACTAATTT 
ATACCTTAAA 
GTAATTTCTA 
AATTTCTAAT 
TAAATAGGTC 
GGATCTGCCA 
ATACCAGTGG 
AATTTCAGTT 
ACTGTGACTT 
GAAAAGTTTC 
AAGTTAATAG 
TAAATTTATA 
AAATTAOCTT 
TAAATAATTT 
ATCA6TGGTT 
CCACCTCATA 
AGCTCTTTGA 
AGGAAAATGC 
TTTTGATGGT 
GAAATTTAAG 
TATAACTTCA 

CAGTTCACTA 
TGACAGGT7T 
AGCTCTCCAG 
CTTCACTGCT 
TTAACATTCA 



41 

I 

GTTTTACTTA 
TTTACTCCCX! 
TAAACTTTTO 
TTAACCTGGA 
TAAGCCACTG 
AACACTAACT 
AAGCTTTCAT 
TGCATTTTCC 
ATGTCTGGAT 
GOGCAAATTT 
ATACCTGATT 
AAGTTGGTTC 
ATTATATAGT 
TATATATGTT 
ATAAQATACA 
TAGATTTTTT 
GAGTTGGTCT 
CTTAGGTTAT 
GAAAATGAAT 
TGAGTGTGAT 
GAAATCTTTT 
TAATTCAAAT 
ATTATXAGAA 
ATT7AAATCA 
TTCAACCCTC 
ACAATTAAGT 
GTGATTCTAA 
TTTTATTTCI 
TTAAATTCGA 
AGTTTAAGTT 
AAATAGATAT 
CTCTGGAGCT 
A0CIAT6GTT 
GTAITGCrrGGT 
CAGTTCTAGA 
TCTTAGAAGG 
GAATTGGGAA 



51 
I 

1TGA6AGTGT 
AAATTATTCA 
AOTTATAACG 
TTTTAATTTT 
AATTCAGTTC 
TCTTGATATT 
AAAATATTT6 
TATATATGCA 
ATAAAAGATA 
AAACCTGAAA 
TTTTTTCTTG 
TAATTTAAAA 
ATTTCAAAAC 
TCAAAAACCA 
AGGTCT6CAT 

• rrmiTi TT 

TGATCTAGGA 
GGCTTGTGAC 
TCTTAAAATT 
CTCTCTTTTG 
GTGACAGCAG 
GATAAAAAT6 
CTGTGCCTAT 
AGACCACCAT 
ACTTCATATT 
CTAAATTTCT 
TTTGTAGTCA 
CCCATGTTAA 
CTAAA6AACA 
CCATCAAACT 
TTCATTCAAA 
GCCIGTTCAG 
CAGAGTTCTG 
GAAT6GATAG 
AAAGCT7TGA 
TAGAATTAAG 
TATTAAITTT 






TCCAGTGAGT 
CAATTTTGTG 
AWVACCTCAT 
TTTAAAGATG 
CTTCAGAAAT 
AQATGGTATT 
TT6GAAGC0Q 
ATTTTATATT 
CTTCITAAAA 
ATAGAGATTC 
AAGACATTAA 
CCACATTAAA 
TGT60CTGCT 
CCTTCATCAA 
ACAACATCTG 
OXCRAACAC 
CACACAACCA 
GACAACACAT 



AGTTTTCTGA 
TTTGTrrACT 
GCCTTTTCAT 
CTTTAATGAA 
CCATATATTT 
TAAAATGAAT 
CCA60CATTC 
ACTATGGTAT 
CCATAACCTG 
TTCTTTTATG 
CATAAGTCTC 
CAACCACX3GC 
ATOGOCTCTG 
6CACTT6CCA 
CAACTCTACC 
AAAACCACTA 
ACACACCAOG 
CACATACACT 



AATTG6TAAC 
TTTAT6TAAA 
TACATCTAAT 
AAGTATTAAG 
GTCATATTTA 
GC»CAAAAAT 
ATGTAGAGAG 
CTGTGTACCA 
GCTTGCCTTT 
AASftAGAGCT 
TGAGCAGTGA 
AACACTCAGA 
GCATAACTTA 
ACACATTCAC 
CTATCAACTO 
AATCATAACC 
ACXAAACACC 
CACTACCCCC 



TTGGAGAGTA 
AATTTGATAT 
TTQAACTCTC 
AAAATATATA 
TTTTTTTAGA 
ATCTTGTACC 
TTTATAAGAA 
TATTTCTAAG 
TAGTGTTAAA 
6A0GTAATTT 
TACATTTTCA 
CTTG6CACTT 
CACGAATOGT 
CTCTAACTTG 
CCAACCTAAA 
ACCACACA06 
CCACCACAAA 
CiCATACICOC 



Seq ID NO: 234 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 27-261



AGCAQGAGGA 
CrCTACGGOG 
TTCCTGCCCC 
GG06CTCACT 
CAGGGTTC06 
SAGTTTTTCT 
CASAAAGAAT 
ATGTTGCAGG 

■ nvi ' cn ' ci ' C 

ATAAAAACTG 
GATGCCAGGA 
TCAAGCCAAS 
CTCAAA0CC8 
QGTCAGAACA 
AGACAGCXrrO 
TCTCAGTATC 
CATAAACACA 
GGAAAG6TCT 
TCTTTTGATO 
GCAAAGAAAA 
TTTATTTTTA 
TTTGACCTTG 
QATTAAAACA 
AAATCTCAAO 
AAAAAAAAGC 
GCACTTCTCT 
TCCCATCAAA 
ATAATCCC 



11 

I 

GAGCTGGCGG 
TA6CAGTTAC 
OTGCTCATTT 
QAAACAST6T 
ACCAATCCAA 
TTGCTCTGAT 
GCAAG6AGA7 
AAOAGCTAGT 
CACTTQ6CAT 
TTCAGCGGTT 
6GAAAGATGC 
OCAAC3U36TG 
GGGAAGCCCA 
CAGCTAA8CA 
TGA06TTTCA 
TTACGCCCAG 
TAACAGCA6C 
0CTGT6ACTG 
ACTCTATATC 
ATAAAAGACA 
AACTGAATTC 
AAATAATCTT 
TATTAGTAAT 
GCTTTTAAAO 
OCTCCATCTQ 
TCTCATTTTC 
GCCAAAGAAA 



21 

I 

GAAGACATGC 
ATCAGACT6A 
QOQGCTGACO 
GTTGCTCCAC 
GAGCCTTGCA 
CTTGGAGACA 
AGACCAAOGT 
CTTTCASGCT 
ATCAAOAGCC 
CXSCCAAC3^ 
CAGGGGTAAA 
TTCTGTTTTT 
CTCTA GAACC 
GATOGCTTGG 

aaagcaaaa6 
tgacacgatc 
agca ataatt 
ttttattttt 

CAACTCTGAG 
ATTTCCAGTA 
AGCAGAGATT 
TACATTGTAA 
TAATTATTAA 
CATTTGrTACA 
ATTCrGATlT 
CACTGTCTOG 
GAAAAGAAAA 



31 
1 

ACCCCTTGAA 
GACACTTCCT 
GCATTTTAGG 
ACOGCCTTGT 
GAAA6CATTA 
TCCCTCTGCC 
GAGATTCTCC 
GOQCTGGTGA 
AGG0GT66AA 
AA6TGGTAAA 
GTGGGAAAAT 
CATCACAGAA 
CATGCTGGTC 
GTCATCAGGA 
TCCCCTACCA 
TACXXrrCAAA 
AAA8ATGAGA 



GTTTGATTAA 
AGTATGCCAG 
TACATGCATT 
ATTCTTAATG 
AGGAOAATAA 
AATGACTGGA 
TCATTGTCAG 
CAA6CTAGAA 
TTGTTCTGTA 



AAATAAGGTA 
GTGAATtACA 
AACTTCAGTG 
GATTTGTATG 
AACCTCCTAA 
TTTGTOCAAA 
AATAATTTAA 
TATTCATTAT 
CACAAAATCC 
ATTACCAGTG 
AACAT6AAGA 
TCCIAOGAAT 
CCTCCCTACT 
TACAACCTTA 
GACCCCCAAC 
OCACACACCA 
CAAGCTAACA 
AOOCACCA ' 



41 

1 

GACCCAGAGA 
GTTTACAGGA 
CCTCAGCCCA 
TTTGCTTGTT 
ACOTGCTTTT 
TAGTGGAAAC 
TTCATGCACT 
CCTGAGAAAO 
GACTAAAACA 
GTAGCAAAAA 
GGGAACCTGA 
CTAA7AAGTG 
ATCCATATOC 
OGTCCATTAC 
6CCA6TGAAQ 
ACTTAAAAAA 
TGAGAACAAT 
AGA6GAAQAA 
A6AAAT6ACC 
TTCGAATTAA 
ACGATGAITA 
ATCAAAAC3UV 
TTGCAAATAC 
CATTTTTTAA 
T6C3VACAACA 
ATTCrCAOGA 
CAGATATATG 



CAGTTCTAAT 
CCAGAAGTGC 
TGAGTTTATA 
TTGGATAACT 
AGTTTATCTG 
AATTGTAtGC 
TAAATTGGTA 
AACATTGTAT 
CATCTGCACA 
GZGACAAOCA 
CCATCCTATA 
TGTCTAOOCT 
CCAACTCACC 
ACAACACAAC 
CACAOOCAOC 
ACCACAAACA 






51 
I 

GAGGCCGTCT 
GACTATAAAA 
TCTGCACCCA 
GGCOajCTCT 
CTCTTTGGCA 
ATAAGGAATA 
CAAGAGAAAG 
AATGTCCAGC 
GGAAATGTTT 
TGGGGATGGA 
AGCCAGGAGG 
GTGCTGAGGA 
CCAAGGCCCT 
ATCCAAAOGA 
CTACCTGATT 
AAAAGGGAAA 
TAAGAA AAAA 
6AATGATTTT 
TTGAACCACA 
TGATTTACTT 
ACATCTGAAA 
GGTTCTCAGT 
AACATTCCTA 
ATTT6AAAAA 
AAAAAGGTAT 
CTACCTTT6A 
ACATTAAAAA 



Seq ID NO: 235 Protein sequence:
Protein Accession #: Eos sequence 



1 11 21 31 41 51 

I I I I I I 

KHPLKTQREA VCLPRSSYIR LRHFLFTGDY KXPAPCSFGA DAZL6LSPSA PRRSXiKQCVA 
PHRLVLLVGA XiSOPRPIQEP CRKB 

Seq ID NO: 236 DNA sequence
Nucleic Acid Accession #: NM_002075
Coding sequence: 406-1428



CCACAATAGG 
ACAGGATCAG 
AGTCCTTTCT 
ACCACCCTGA 
GGOCAGGCCA 
OGTOGCAGCT 
CCA6CCAGAG 
CAACTGOGTC 
GCTGA06TTA 
08GA0G0GGC 
GATTCTAAGC 
ACCACCAACA 
6CCCCATCA0 
CTCAAATCOC 
CTCTCCTGCT 
TGTOCCTTGT 
GACTGCATGA 
6CCAGTGOCA 



11 
I 

GGCAGACCTG 
ACCCAGAGGC 
AATCTCAGCT 
6CT6A0GAGC 
GGOCAGCTCC 
6AGGGAGTAA 
CCCAAGAGCC 
A6GAAGCGQA 
CTCTGGCAGA 
GGAOGTTAAG 
TGCTGGTAAG 
AGGTGCA06C 
6GAACTTTGT 
GTGAGGGCAA 
GCCGCTTCCT 
GGGACATTGA 
GCCTGGCTGT 
AGCTCTGGGA 



21 

1 

TCCATCCTTC 
AGCTGGTT6G 
CCTQCCTGTA 
ACAGTTTGAG 
TCTGGCAGCA 
GGAGGCTCCC 
AGAGTGACOC 
GCAGCTCAAG 
GCTGGTGTCT 
GGGACACCTG 
TGCCTOGCAA 
CATCCCACTG 
GGCATGTG6G 
TGTCAAGGTC 
GGAT6ACAAC 
GACTGGGCAG 
GTCTC C TGAC 
TGTQOQAGAG 



31 
I 

TCTGTGGGTC 
GGTTTGTCGA 
CCCTCCCATA 
GCCCCCCCAA 
6AGCCTG66C 
AGGAACCGGA 
CTOGACCTGT 
AAGCA6ATTG 
GGCCTASAGG 
GCCAAGATTT 
GATGGGAAGC 
OGCTCCTOCT 
GGGCTGGACA 
AGCC6GGAGC 
AATATTGTGA 
CAGAA6ACTG 
TTCAATCTCT 
GGGA0CT6CC 



41 

I 

CCCTGTACCT 
GAAGAAGGAT 
CTCACCAAAC 
OCCCOCGCOG 
AGGTGA0GG6 
GCTGGAAACC 
CAGCCATGGG 
CAGATGCCAG 
TQGXGGGACG 
A0GCCAT6CA 
T6AT0GTGT6 
GGGTCATGAC 

AomsTGrrc 

■i TTCl OC rC A 
CCAGCTGOGO 
TATTTGTGGO 
TCATTTOGGG 
GTCAGACTTT 






51 
I 

TTCTCCCCCA 
TATCCAGATC 
CCTCTTCCCC 
GTCG6GGCCA 
CGGG0G06GG 
CG6CX3GAGGT 
GGA6ATGGA0 
GAAAGCCTGT 
AGTGCAGAT6 
CTGGGCCACT 
6GACAGCTAC 
CTGTGCCTAT 
CATCTACAAC 
CACAGGTTAT 
GQACACCACG 
ACACAGGGGT 
GGCCTGTGAT 
CACTG6CCAC 



GAGTOSGACA TCAAOSCCAT
GATGACGCTT CCTGCCGCTT 
CAOGAGAGCA TCATCTX306G 
TTOGCTOGCT AOGAOSACTT 
GGCATCCTCT CTGGCOVOGA 
GCTGTGGCCA CAGGTTCCT6 
AA0GGAAG7G GAAGGCAGT6 
GGTGTTCTCT TCTATATTCC 
GGGAGCATG6 GACTGTGCCT 
CCATCTCCTC CX3VTGGCCTT 
GGACAACCTG CCCCTCCCCA 
GCCCTAGGAT TCCTCCCCCA 
TTGGCCCTGT GACTATGGCT 
TTCTCCTTTT TCTACCTTTT 
GT 

Seq ID NO: 237 Protein sequence:
Protein Accession #: NP_002066

1 11 21 31 41 51 

1 I I I I I 

MGEMBQLRQE AEQIiKKQZAD ARKACA0VTL ABIiVSGLEW GRVQMRTRRT laRGHLAKIYA 60 

MHKATDSKIaL VSASQXX3KLZ VMDSYTOIKV HAIPLRSSWV I1TCAYAPSC5H FVAOGGLDNM 120 

CSXYMIiKSRE GNVKVSRELS AHTGVLSCCR PLDDNNIVTS SGDTTCALWD lETGQQKTVP 180

VGHTODCMSL AVSPDFTfLFI SGACDASAKL WDVREGTCRQ TPTaffiSDIH AZCPFFNGEA 240 

ICTGSDDASC RLFDIiRADQB LICFSHESII OGITSVAFSL SGRLLFAGYD DFNOIVIIDSM 300 
KSERVGIIiSG H0NRV9CLGV.TADGMAVATG SWDSFLKIHH 

Seq ID NO: 238 DNA sequence 

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51 

1 I I 1 I t 

TCCCAAT5T0 TKGAACCTAC CATAAATTCT TTTCTTAOJO OACAATCTTA TNCTAAHCAA 60 

TACCATTPGC TTTTAAGGCA GATAATCCTC CAAGTTTTCT AATGATATCT GAAACTATTA 120 

ACTGATTCTO T6AATTATGA AATCTGAAAA GGAATTGGAA GTTGCTAAAA ATCTATCATT 180 

TCCATTGACC AGTGTGAAGC ACAGTGGAAT GAGAATGOGT GCCCTGACAC CAA AGAAA AA 240 

TAAGTGACTO GAAA6CTGAA GAATCAOOGO CTTCAfiTGAC ATG6AA00CA GT6ATTTGAT 300 

TTTTGACX3V0 TATOOGOTGA CTTTGAGGTG GTCAAGAAAC CACACTTTAA GAACAATGTC 360 

CAAAAAGGGG AAAAAAAAGA GCAACCAAAG AAAAAAAATC CATAAAATTG CACA GAAG AA 420 

AAGAAAGAAA AATAAAATAC ACAATATGGA CGATOGAGAA AAACAGTTAC An-nj ^X "i-rAT 480 

GGATCAA6AA GTTTGTGTAC ACATAATCTC ATTTTGAGAT ATATAACTAT TTTTGTCTTT 540 

C3U3AASTGAA TCAAAATATT TCAAAATGCT GTCTTATGAA ACT ACAATA T TCTCACAGAT 600 

TAGAAAAGTT TTTCTOTAAA A8TCAGATAG TAAATATTTT AGGTTTTGCA GTGTCTTTTG 660 

CAACTACTCA ACTTTCCTAC TGTAGCACAA GAGTAGCTGT GGTACTGTGC AAATAAATTG 720 

CTTGTGTTCC AATAAACCTT CATTTACAAA AACATGCCAT GGGCCATATT TGGCCTGTAC 780 

ACTGrrGTTT GCCAAGTCCT AATATACTTG CTTAGCAAGT ATTGTGA6CT ATTTGAGGAA 840 

GACATGAAAG TTCATTGGGT T0CTAAAAA6 TATGTAGAAA TTCAAAOOAA AATT AAAATT 900 

TAGGCTAAGT TATAATACAC TOTTTTAACA ATTOTAAAAT GTAAGAGAAA TTTACAAATA 960 
AAAATCCCAA ATAAAA 

Seq ID NO: 239 DNA sequence

Nucleic Acid Accession #: NM_001786.1

Coding sequence: 130-1023

1 11 21 31 41 51

I I I I i I 

GGGGGGGGGG GGCACTTGGC TTCAAAOCTG GCTCTTGGAA ATTGAGCGOA OAOOQAGGGG 60 

GTTGTTGTAG CTGC0SCT6C GGCOQCOGOG GAATAATAAG CCGG6ATCTA OCATACCCAT 120 

TQACTAACTA TX3GAAQATTA TACCAAAATA GAGAAAATTG GAGAAGGTAC CTATGGAGTT 180 

GTGT A TAAGG OTAGACACAA AACTACAGGT CAAGTGGTAG CCATGAAAAA AATCAGACTA 240 

GAAAGTGAAO AGGAAOGGGT TCCTAGTACT GCAATTCOGG AAATTTCTCT ATTAAAGGAA 300 

CTTCGTCATC CAAATATAGT CAGTCTTCAO GATCTGCTTA TGCAGGATTC CAOGTTATAT 360 

CTCATCTTTG AGTTTCTTTC CATGGATCTO AAGAAATACT TG6ATTCTAT CCCTCCTGGT 420 

CAOTACAIGG ATTCTTCACT TGTTAAGAGT TATTTATACC AAATCCTACA GGGGATTGTG 480 

TTTTOTCACT CTAGAAQAGT TCTTCACAGA GACTTAAAAC CTCAAAATCT CTTGATTGAT 540 

6ACAAA00AA CAATTAAACT GGCTGATTTT GGCCTTGCCA GAGCTTrPGG AATACCTATC 600 

AGAGTATATA CACATGAGGT AGTAACACTC TCGTACAQAT CTCCAGAAGT ATTGCTGGQO 660 

TCAOCTOQTT ACTCAACTCC AGTTGACATT TGGAGTATAG GCACCATATT TGCT GAACTA 720 

0CAACTAA6A AACCACTTTT CCAT6GGGAT TCA6AAATTG ATCAACTCTT CAGGATTTTC 780 

A6A0CTTT66 GCACTCCCAA TAATGAAGTG TGGCCAGAAG TGGAATCTTT ACAGGACTAT 840 

AAGAATACAT TTCCCAAATG GAAACCAGGA AGCCTAGCAT CCCATGTCAA AAACTTGGAT 900 

GAAAATGGCT TOGATTTGCT CTCGAAAATG TTAATCTATG ATCCAGCCAA ACGAATTTCT 960 

0GCAAAAT6G CACTGAATCA TCCATATTTT AATGATTTGG ACAATCAGAT TAAGAAGATO 1020 

TAOCTTTCTG ACAAAAAGTT TCCATATGTT ATGTCAACAG ATAGTTGTGT T TTTA TTGTT 1080 

AACTCTTGTC TATTTTTSTC TTATATATAT TTCTTTGTTA TCAAACTTCA GCTG TACTTC 1140 

GTCTTCTAAT TTGAAAAATA TAACTTAAAA ATGTAAATAT TCTAIATGAA TTTAAATATA 1200 
ATTCT8TAAA T0T6AAAAAA AAAAAAAAAA AAAAA 



CTGTTTCTTC CCCAATGGAO A660CAICIG CACG6GCTC6 1140 

GTTTGACCPG CGGGCAGACC AOGAGCTGAT CTGCTTCTCC 1200 

GATCAOSTCC GTGGOCTTCT CCCTCAGTGG CCGC CTACTA 1260 

CAACIGCAAT GTCTGG6ACT CCATGAAGTC TGAGCGTGIG 1320 

TAACAGGGTG AGCTGCCTGG GAGTCACAGC T6ACGGGATG 1380 

GGACAGCTTC CTCAAAATCT GGAACTGAGG AGGCTGGAGA 1440 

AACACACTCA GCAGCCCCCT GCCOGA CCCC ATCTCATTCA 1500

GGGTCCCATT CCCACTAAGC TTTCTCCTTT GAGGGCAGTG 1560 

TT6GGAGGCA GCATCAGGGA CACAGGGGCA AAGAACTGCC 1620 

CCCTCCCCAC AGTCCTCACA GCCTCTCCCT TAA TGAGCA A 1680 

GCCCTTTGCA GGCCCAGCAG ACITGAGTCT GAGGOCCCAG 1740 

GAGCCACTAC CTTTGTCCAG GOCT GG GTGG TATAGGGOGT 1800

CTG6CACCAC TAGGGTOCTG GCCCTCTTCT TATTCATGCT 1860 

T'l'ltlT CTC CT AAGAG^OCTG CAATAAAGT6 TAGCACGCTQ 1920 



Seq ID NO: 240 Protein sequence:
Protein Accession #: NP_001777.1

1 11 21 31 41 51

I 1 I I i I 

MEDyTKIBKZ GBGTVGWYK 6RBRTTGQW .AMKKIRIiESE BBOVPSTAIR BISLLKBLRH 60 

PNIVSLQDVL MQDSRLYLIF EFLSMDIiKKY USIPPGOYM DSSLVKSYLY QILQGIVFCX 120
SRSVXARDLK PQHIiLIIIDKO TIKLADFGIA RAFGIPIRW TBBWTZ^mi SPEVLL6SAH 180
ySTPVDXHSZ GTZFABIATK XPLEBGDSEI DQLFRIFSAL GTPNHEVHFE VBSLODYUIT 240 
FFXHRFGSIA SBVXNLDQIO LDIiZiSKKIiZy 0PAXRZS6RN AIAHPYFllDL DHQZKXM 



Seq ID NO: 241 DNA sequence

Nucleic Acid Accession #: NM_033379.1

Coding sequence: 132-854

1 11 21 31 41 51

I I I I I I

CGCXXX30GCG C666CTCAAC TTTGTAGAGC GA666GCCAA CTTGGCAGAG 0GCG06GCCA 60

OCTTTGCAGA GAGOGCCCTC CAGGGACTAT GOGTGOGGGG ACAC3GGGATC TACCCATACC 120 

ATTGACTAAC 7ATGGAAGAT TATACCAAAA TAGAGAAAAT T66AGAAGGT ACCTATG6AG 180 

TTGTGTATAA GGGTAGACAC AAAACTACAG GTCAAGTGGT AGCCATGAAA AAAATCAGAC 240 

TAGAAAGTGA AGAGGAAGGG GTTCCTAGTA CTGCAATTOG GGAAATTTCT CTATTAAAGG 300 

AACXTOOTCA TCCAAATATA GTCAGTCTTC AGGATGTGCT TATGCAGGAT TCCAGGTTAT 360 

ATCTCATCTT TGABTTTCTT TCCATGGATC TGAAGAAATA CTPGGATTCT ATCCCTOCTG 420 

GTCAGTACAT GGATTCTTCA CTTGTTAAGG TAGTAACACT CTGGTACAGA TCTCCAGAAG 480

TATTGCTGGG GTCAGCTCGT TACTCAACTC CAGTTGACAT TTG6AGTATA GGCACCATAT 540 

TTGCTGAACT AGCAACTAAG AAACCACTTT TCCATGGGGA TTCAGAAATT GATCAACTCT 600 

TCAGGATTTT CAGASCTTTG 6GCACTCCCA ATAA7GAA6T 6TG6CCAGAA 6T6GAATCTT 660 

TACAGGACTA XAA8AATACA TTTCCCAAAT GGAAACCA66 AAGCCTAGCA TCC CA TSTCA 720 

AAAACTT06A TGAAAATG6C rTGGATTTGC TCT06AAAAT GTTAATCTAT GATCCAGOCA 780 

AAOGAATTTC TGGCAAAATG GCACTGAATC ATCCATATTT TAATGATTTG GACAATCAGA 840 

TTAAGAAGAT GTAGCTTTCT GACAAAAAGT TTCCATATGT TATGTCAACA GATAGTTGTG 900 

TTTTTATTGT TAACTCTT6T CTATTTTTGT CTTATATATA ' mfmvrr ATCAAACTTC 960 

A6CTGTACTT OGTCTTCTAA TTTGAAAAAT ATAACTTAAA AATGTAAATA TTCTATATGA 1020
ATTTAAATAT AATTCTGTAA ATGTGAAAAA AAAAAAAAAA AAAAAA 

Seq ID NO: 242 Protein sequence: 
Protein Accession #: NP_203698.1

1 11 21 31 41 51 

i I I I i I 

MEDYTKIEKI GEGTYGWYK GRHKTTGQW AMKKIRLBSE EB6VPSTAIH EISLLKELRH 60 

PNIVSLQDVL MQDSRLYLIF EFLSMDUOCy LDSIPPGQYM DSSLVKWTL HYRSPBVLL6 120 

SARYSTPVDZ HSXGTZFAEL ATKXPLFB(3> SEIOQLFRZP RALGIPKMEV WPEVBSLQDY 180 
KNTFPKHKPG SLASHVXNLD WGWhLSVH LZYDPAKRZS GKNALNHPYP KDLDNQZXKM 

Seq ID NO: 243 DNA sequence

Nucleic Acid Accession #: AF101051.1

Coding sequence: 221-856

1 11 21 31 41 51 

I I I I I I 

GAGCAACCTC A6CTTCTAGT ATCCAGACTC CAGOGCCGCC CG6GGC60GG ACCCCAACCC 60 

CQAGCCAGAG CTTCTCCAGC 660GGGQCA6 CQAGCAGOGC TOCOCOCCTT AACTTCCTCC 120 

GOGGGGCCCA GCCACCTT06 6GAGTCGGGG TTOCCCACCT 6CAAACTCTC C6CCTTCTGC 180 

AOCTGOCACC CCTQAGCCA6 CG06G6C6CC OGAGCGAGTC ATG6CCAA0G CGGGGCTGCA 240 

OCTGTTQGGC TTCATTCTCG CCTTCCTGGG ATGGATCGGC GCCATCGTCA GCACTGCCCT 300 

QCCCCAGTOG AGGATTTACT CCTATGCCGG OGACAACATC GTGACCGCCC AGGCCATGTA 360 

CQAGGOGCTG TG6ATGTCCT GCGTGTG6CA GAGCAOOGOG CAGATCCAGT GCAAAOTCTT 420 

TGACTCCTTG CT6AATCTGA GCAGCACATT GCAAGCAACC C8TGCCTTGA TGGTG6TTG6 480 

CATCCTCCTG GGAGTGATAG CAATCTTTGT OGCCACCGTT GGCATGAAGT GTATGAAGTG 540 

CTTGGAAGAC GATGAGGT6C AGAAOATGAG GATG6CTGTC ATT6GGGGTG CGATATTTCT 600 

TCTTGCAGGT CTGOCTATTT TA6TT6CCAC AGCATGGTAT 66CAATAGAA TOGTTCAAGA 660

ATTCTATGAC CCTATGACCC CA6TCAATGC CAGGTAOGAA TTTGGTCAGO CTCTCTTCAC 720 

TGGCTGGGCT GCT6CTTCTC TCTCCCTTCT GGGAGGTGCC CTACTTT6CT GTTOCTOTCC 780 

CC6AAAAACA ACCTCTTACC CAACACCAAG GCCCTATCCA AAACCTGCAC CTTCCAGOGG 840

GAAAGACTAC GTGTOACACA GAGGCAAAAG GAGAAAATCA TGTTGAAACA AACCGAAAAT 900 

GGACATTGAG ATACTATCAT TAACATTAGG ACCTTAGAAT TTTGGGTATT GTAATCTGAA 960 

GTATGGTATT ACAAAACAAA CAAACAAACA AAAAAOCCAT G TG I T A AAAT ACTGAGTGCT 1020 

AAACATG6CT TAATCTTATT TTATCTTCTT TCCTCAATAT AGGAGGGAAQ ATmAOCAT 1080 

TTGTATTACT GCTTOCCATT QAGTAATCAT ACTCAAATGG GGGAAGGGGT GCTCCTTAAA 1140 

TATATATAGA TATOTATATA TACATGTTTT TCTATTAAAA ATA6ACAGTA AAATACTATT 1200 

CTCATTATGT TGATACTAGC ATACTTAAAA TATCTCTAAA ATAGGTAAAT GTATTTAATT 1260 

CCATATTQAT GAAGAT GTT T ATTOGTATAT mvm ' i ' ll ! GTOCTTATAT ACATATGTAA 1320 

CAGTCAAATA TCATTTACTC TTCTTCATTA GCTTT6GG7G CCTTTOCCAC AA6ACCTAGC 1380 

CTAATTTACC AA6GATGAAT TCTTTCAATT CTTCATGOST GCOCTTTTCA TATACTTATT 1440 

TTATTTTTTA CCATAATCTT ATAGCACTTO CAT06TTATT AAGCCCTTAT TTGTTTTGT6 1500 

TTTCATTGGT CTCTATCTCC TGAATCTAAC ACATTTCATA GOCTACATTT TAGTTTCTAA 1560 

AGCGAAOAAG AATTTA7TAC AAATCA6AAC T7TGGAGGCA AATCTTTCTG CATGACCAAA 1620 

GTGATAAATT OCTGTTGACC TTCCCACACA ATCCCTGTAC TCTGACCCAT AGCACTCTTG 1680 

TTTGCTTTGA AAATATTTGT CCAATT6AGT AGCT6CATGC TGTTCCCCCA GGTGTTGTAA 1740 

CACAACTTTA TTGATTGAAT TTITAAGCTA CTTATTCATA GTTTTATATC CCOCTAAACT 1800 

ACCTTTTTGT TCCCCATTCC TTAATTGTAT TGTTTTCCCA AGTGTAATTA TCATGOGTTT 1860 

TATATCTTCC TAATAAGGTG TG6TCTGTTT GTCTGAACAA AGTGCrAOAC TTTCT66AGT 1920 

GATAATCTGC TGACAAATAT TCTCTCTGTA GCTGTAAGCA AGTCACTTAA TCTTTCTACC 1980 

TCTTTTTTCT ATCTGCCAAA TTGAGATAAT GATACTTAAC CAGTTAGAAQ AGGTAGTGTG 2040 

AATATTAATT AGTTTATATT ACTCTCATTC TTTGAACATQ AACTATQCCT ATGTAQTGTC 2100 

TTTATTTGCT CAGCTGQCTQ AGACACPGAA GAAGTCACTG AACAAAACCT ACACAOGTAC 2160 

CTTCATQTGA TTCACTOCCT TCCTCTCTCT ACCAGTCTAT TTCCACTGAA CAAAACCTAC 2220 

ACACATACCT TCAT6TGGTT CAGTGCCTTC CTCTCTCTAC CAGTCTATTT CCACTGAACA 2280 

AAACCTAOGC ACATAOCTTC ATGT G GCICA GTGCCTTCCT CTCTCTACCA GTCTATTTCC 2340 

ATTCTTTCAG CTGTGTCTOl CATGTTT6T6 CTCTGTTCCA TTTTAACAAC TOCTCTTACT 2400 

TTTCCAGTCT GTACAGAATG CTATTTCACT TGAGCAA6AT (SITGTATGGA AAGGGTCTTG 2460 




GOICTGGTGT CTGGAGAGCT 6QATTTGAGT C rm tfi Xi CT A TCMTGACOS ' I V m ig mt S 2520 

ABCAAfiGGAT TT06CTGCTG TAAGCTTATT GCTTCATCTG TAAGC3GGTGG TTTGTAATTC 2580 

CTGATCTTCC CACCTCACAG TGATGTTGTG QGGATCCRGT GAGATAGAAT ACATGTAAGT 2640 

GTGGTTTTGT AATTTGAAAA GTGCTATACT AAGGGAAAGA ATTQJUSGAAT TAACTGCATA 2700 

OGTTTTGGTG TTGCTTTTCA AATGTTTGAA AATAAAAAAA TQTTAAGAAA TGGGTTTCTT 2760 

GGCTTAACCA GTCTCTCAAG TGATGAGAOV GTGAAGIAAA ATTGAGTQCA CTAAAC3GAAT 2820

AAGATTCTGA GGAAGTCTTA TCTTCTGCA6 TGAGTAT6GC OCAATGCTTT CTGTG6CTAA 2880 

ACAGATGTAA TG6QAA6AAA TAAAAGCCTA CGTGTTG6TA AATCCAACA6 CAAGGGAGftT 2940 

mrGAATCA TAATAACTCA TAAGGTGCTA TCTGTTCAGT GATGCCCTCA GAGCTCTTGC 3000 

TGTTASCTGG CAGCTGAOGC T6CTAGGATA GTTAGTTTGG AAATGGTACT TCATAATAAA 3060 

CTACACAAGG AAAGTCAGCC A00GT6TCTT ATQAGGAATT GGAGCIAATA AATTTTAGTQ 3120 

T6CCTTCCAA ACCTGAQAAT ATATGCTTTT 6GAAGTTAAA ATTTAAATGG CTTTTQCCU: 3180 

ATACATAGAT CTTCATCATG TGTGAGTGTA ATTCCATGTG GATATCAGTT ACCAAACATT 3240 

ACAAAAAAAT TTTATGGCCC AAAATGACCA ACX3AAATTGT TACAATAGAA TTTATCCAAT 3300 

TTTGATCTTT TTATATTCTT CTACCACAOC TGGAAACAGA CC3^ATAGACA TTTTGGGGTT 3360 

TTATAATGGG AATTTGTATA AAGGATTACT CTITTTCAAT AAATrGTTTT TTAATTTAAA 3420 
AAAA6GAAAA AAAAAAAAAA AAA 



Seq ID NO: 244 Protein sequence:
Protein Accession #: AAD16433.1

1 11 21 31 41 51

I I I I I I

MANAGLQLLG PZLAFIiGWIG AXVSTALPQH RIYSYAGDNI VTAQAHYE6L WMSCVSQSTG 60 

QIQCKVFDSL LKLSSTLQAT RALHWGILL 6VIAIFVATV GMKCMKCLED DEVQKMRMAV 120 

IGGAIFLLAG LAILVATAWY GNRXVQEPYD PMTPVKARYB FGQAIiFTGMA AASLCLLGGA 180 
LLCCSCPRKT TSYPTPRPYP KPAPSSGKDY V 

Seq ID NO: 245 DNA sequence

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51

I I I I I I

' rm ' rrrrrr ' riTrrrrr T T tttttcaagq agagcacaag gaactttatt aatgactttc 60 

TTAATOGTTA AATOCrGTTT ACCAAGTGAC CCAGAGGCAG CGTGGTTTAG TGGTTTCAAC 120 

AGCATGGTCX: CGAGAGTCTG ACAAACCTCA GTTCAAATCC TTCTTTTGTC TTCACTTAGT 180 

TTTTCrrOCT 6AGATTTAGT TTCTTCATOG TTAACAATGA GGATATTAAT ATGTTTCACA 240 

CA6TTGTTAT GAAGAAT6CA TATAT7AGAA TGCCT6TA0T CTCAGCTACT CAG6AGGCTA 300 

AGGTGGGGAG GTCGCTCAAQ CCCAGGAATT CAAAOCTOCA ATGCATTATG ATTACAGCTG 360 

TTAATAGCCA CTGCACTTCA 6CCTGGGCAA TGTAGTAAGA TCCCATCTCT GGCTOGGAGG 420 
GTCCTACGCX: CACOGAGTCT 06CTGATTGC TAGCACAGCA GTCTGAGATC AAACT6CA 

Seq ID NO: 246 DNA sequence

Nucleic Acid Accession #: XM_058553.2

Coding sequence: 897-1400

1 11 21 31 41 51 

I I I I I I 

AATTTTCAGA AGTTTCGTAT GGGGATGGTT TTATATAAAT TCAGGTTTTT CCCACAATAA 60 

TAAATGTATT TAGTCTCAGT GCTCAATAGA AQAGATTTCT AATAGAAAAG GATTCAAACT 120 

6T6AAACCAT TTCTCTTTTA ATGTTTCACA TTCCTGTTAC AGATTTGTTC TCTTGTGACT 180 

CT6TTATCCA TAATATGGAC AGTTCTT6A6 TCCTAACATT GAGAG6TTTT CCCTTAGTGC 240 

ATAGAGG6AA TGAGTATTAA TTG6AGAAGC TTAAA6TATT GCCACTTTAG CACTGAAGAT 300 

TGGGAT6AGA GGAGGTGAAA CCTCACTAGA AAAAGGGACA ATGTTAGT6T GGCCCTTCCT 360 

GATCATGTTT AAGAAAAGTC ATGAAAATQO TGAACTAGTG TTTCCAAGCA TATTGGAAGG 420 

GTT6AGTGTA TACTGTCTGT CAAAGACTTC CAGCATTTCC AOGTCCTAGA GAG6AACAAG 480 

ACTGGTAACC TGOCTATCTG TATTTTTAAO AACCCA66AS GAAAGCTTTA TAATA6AACA 540 

TTATTTCTOT OTTTATGTAT AACGGOTTTT I'lVrmTlTl ' AAAGACAGGA TCTCACTOCA 600 

rPGTCCAGGC CAAGTGCAAT GGCACGAACC TCATAGCTCC TGGACTTAAQ TGATCT6CCT 660 

GCCTTTGCCT CCT6AGTAGC TGGGACTACA GGCATGAGCC CCCATGCCTG GCTAAGTTTO 720 

rm ' mxj ' i ' X ' tgtttgtttg tttottttto gggggggttg ' rmGTrm ' tgtagagaoq 780 

TA6TCTT8CT TTGTTQCCAO GCCAOTCTCA AACTCCTG8C TTCAAGT6AT CCTOCTGCCT 840 

CA6CCTCCCA GA6TGCTAG6 ATTACAGCAC TTGGATTCAG CTTCTTCATT TCCAACATGG 900 

AAGAAACTTA CACOGACTCC CTC6ACCCTG AGAAGCTATT GCAATGCCCC TATGACAAAA 960 

ACCATCAAAT CAGGGCTTGC AGGTTTCCTT ATCATCTTAT CAAGTGCAGA AAOAATCATC 1020 

CTGATGTTGC AAGCAAATTG GCTACTTGTC CCTTCAATGC TC8CCACCA6 GTTCCT06AG 1080 

CtGAAATTAG TCATCATATC TCAAGCTGTG ATGACA6AA6 TTGTATTGAO CAAGATSTTO 1140 

TCAACCAAAC CAOGAGCCTT A6ACAAGA6A CTCTGGCTGA GAGCACTTGG CAGTGCCCTC 1200 

CTT60GATGA AGACTGGGAT AAAGATTTGT G6GAGCAGAC CAGCACCCCA TTT6TCTGGG 1260 

GCACAACrCA CTACTCTGAC AACAACAGCC CTGOGAGCAA CATAGTTACA GAACATAAGA 1320 

ATAACCTGGC rrCAGGCATG OGA6TTCCCA AATCTCTGCC GTATGTTCTO CCATGGAAAA 1380 

ACAATGGAAA TGCACAOXAA CIGAATAGCT ATCTCATCAA ATGCCAGACC CTAGAAGACT 1440 

Q TTGC n 'Cri' CTTCTACCAG T GG Q T TCTCA riTl ' CL l' C LT AATCZAATTA TAGAATGGTA 1500 

AACTCCCTGT GACTTTCCAA ACTGACAAGC ACACTTTTTT GCTCCCCCCT TGAATCCTCA 1560 
TTTAATQCAA GAACCCTCAT ACTCAfflkAGC TTCCAAATAA ACCTTTGATA CAGATTG 



Seq ID NO: 247 Protein sequence: 
Protein Accession #: XP_058553.1

1 11 21 31 41 51 

I I I I I i 

HBETYTDSIiD PEKLLQCPYD KNBQIRACRP PVHLZKCRKN HP DVASK IiAT CPFNARBQVP 60 
RABISBBZSS CDSRSCZEQD WNQTRSLRQ BTLABSTHQC PPCDEDHDKD IiHEQTSTPFV 120 
WGTTBySQNN SPASNXVTEB KNNIASGMRV PRSLPYVLFH RNNQNAQ 



Seq ID NO: 248 DNA sequence
Nucleic Acid Accession #: NM_003392
Coding sequence: 758..1855

1 11 21 31 41 51 

I t I I I I 

TTAAGGAAAT CCGGGCTGCT CTTCCXXATC TGGAAGTGGC TTTCCCCACA TOGGCTOGTA 60 

AACTGATTAT GAAACATACX3 ATGTTAATTC GGAGCTGCAT TTCCCAGCTG GGCACTCTCG 120 

CXW6CTGGTC CCX33GGGCCT CGCCCCXCAC CCCCTGCCCT TCCCTCCOSC GTCCTGCCCC 180

CATCCi q CAC CCCG0606CT Q6CCACCX:0S CCTCCTTGGC AGCCICTGGC 0GCAG06C6C 240 

TCCACroSCC TCCOGTGCTC CTCTCGCCCA TGGAATTAAT TCTGGCTCCA CTTGTTGCTC 300 

GGCCCAGGTT GGGGAGAGGA CGGAGGGTGG COGCAQCGGG TTCCT6AGTQ AATTACCCAG 360 

GAGGGACTGA GCACAGCACC AACTAGAGAG GGGTCAGGGG GTGOQGQACT C GAGCXS AGCA 420 

GGAAGGAGGC AGOGCCTGOC ACCAGGGCTT TGACTCAACA GAATTGAGAC AOCTTTGTAA 480

TOGCTCGCGT GCCC060SCA CAGGATCCCA GOSAAAATCA GATTTOCIGG TGAG6TTG0G 540 

TGGGTGGATT AATTTGGAAA AAfiAAACTGC CTATATCTTG CCATCAAAAA AC TCACQ GAG 600 

GAGAAGOOCA GTCAATCAAC AGTAAACTTA AGA6ACCCCC GATGCTCCCX: TGGTTTAACT 660 

TGTATGCTTG AAAATTATCT GAGAGGGAAT AAACATCTTT TCCTTCTTCC CTCTCCAGAA 720 

GTCCATTGGA ATATTAAGCC CAGGAGTTGC TTTGGGGATG GCTGGAAC5T0 OATCTCTTC 780

CAAGTTCTTC CTAGTGGCTT TGGCCATATT TTTCTCCTTC GCCCAGGTT6 TAATTGAAGC 840 

CAATTCTTGG TGGTOOCTAG CTATGAATAA CCCTGTTCAG ATGTCAGAAG TATATATTAT 900 

AG6AGCACA6 CCTCTCTGCA GCCAACTG6C AOGACTTTCT CAAGGACRGA AGAAACTGTG 960 

CCACTTGTAT CAG6ACCACA TGCAGTACAT C3GGAGAAGGC GOGAAGACAG GCATCAAAGA 1020 

ATGCCAGTAT CAATTCC6AC ATCGAOGGTG GAACTGCaGC ACTGTGGATA A CACC TCTgT 1080

rrrrGGCAGG GTGATGCAGA TAGGCAGCOG OGAGACXSGCC TTCACATAOS COSTGAGCGC 1140 

AGCAGGG6TG GTGAACGCCA TGAGCOSQGC GTGCOGCGAG GGOSAGCTGT CCACCTGOGO 1200 

CIGCAOGOGC OC06060GCC OCAAGGAOCT GCOGOGQGAC TGGCTCTGGG GOGGCTGCGG 1260 

OGACAACATC GACTATGGCT ACOGCTTTGC CAAGGAGTTC GTGGACGCCX: GOGAGOGGGA 1320 

GCGCATCCAC GCCAAGGGCT CCTACJOAGAQ TGCT06CATC CTCATGAACC TGCACAACAA 1380

CGAGGCOGGC OSCAGGAOGG TGTACAACCT GGCTGATGTG GCCTGCAAGT GCCATGGGGT 1440 

GTCOGGCTCA TGTAGCCTGA A6ACATGCT6 GCTGCAGCTG GCAGACTTCC GCAAGGTGGG 1500

TGATGCOCTO AAGGAQAAOT ACGACAGOGC GG0G6CCATG CXSQCTCAACA GCCGGGGCAA 1560 

GTTGGTACAG GTCAACA6CC 6CTTCAACTC GCSXACCACA CAAGACCTGG TCTACATCGA 1620 

CCCCAOCCCT GACTACTGCG TGC3GCAATGA GAGCACCGGC TCGCTGGGCA OOCAGQGOGQ 1680

CXTOTGCAAC AAGAOGTCGG AGGGCATGGA TGGCTGCGAG CTCATGTGCT GC66C0GTGG 1740 

GTAOSACCAG TTCAAGACCG TGCAGA0G6A GCGCTGCCAC TGCAAGTTCC ACTGGTGCTG 1800 

CTAOGTCAAG TGCAAGAAGT GCA0G6AGAT OGTGGACCRG TTTQTOTGCA AGTAGTGGGT 1860

GCCACCCAGC ACTCA60CCC GCTCCCaGGA OCCGCTTATT TATAGAAAGT ACAGTGATTC 1920 

TGGTTTTTGG TTTTTAGAAA TATTTTTTAT TTTTCCCCAA GAATTGCAAC OGGAACX:aTT 1980

TTTTTTCCTG TTACXATCTA AGAACTCTGT QGTTTATTAT TAATATTATA ATTATTATTT 2040 

G6CAATAATG GGGGTGGGAA CCACGAAAAA TATTTATTTT GTGGATCTTT GAAAAGGTAA 2100 

TACAA6ACTT CTTTTGOATA OTATAGAATG AAGGG66AAA TAACAC ATAC CCTAACTTAG 2160 

CTC3TGTGGGA CATGGTACAC ATCCAfiAAGG TAAAGAAATA CATTTTCTTT TTCTCAAATA 2220 

TGCCATCATA TQQGATGGGT AGGTTOCACT TGAAAGAGGG T6GTAGAAAT CTATTCACAA 2280

TTCAGCTTCT ATGACCAAAA TGAGTTGTAA ATTCTCTGGT GCAAGATAAA AGG TCTT GGG 2340 

AAAACAAAAC AAAACAAAAC AAACCTCCCT TCCCCAGCAG G6CTGCTAGC TTGCTTTCTG 2400 

CATTTTCAAA ATGATAATTT ACAATGGAAG GACAAQAATO TCATATTCTC AAG6AAAAAA 2460 

GGTATATCAC ATOTCTCATT CTCCTCAAAT ATTCCATTTO CAGACRGACC GTCATATTCT 2520 

AATAGCTCAT GAAATTTGGG CAGCAGGGA6 GAAAGTCCCC AGAAATTAAA AA ATTTA AAA 2580

CTCTTATX?rC AAGATGTTGA TTTGAAGCTG TTATAAGAAT TGGG ATTC CA GATTTGTAAA 2640 

AAGACCCCCA ATGATTCTGG ACACTAGATT TTTTGTTTGG G6AGGTTGGC TTQAACATAA 2700 

ATOAAATATC CTGTATTTTC TTAOGQATAC TTGGTTAGTA AATTATAATA QTAGAAATAA 2760 

TACAIGAATC CCATTCACAQ GrCTCTCAGC OTUVGCAACA AGGTAATTGC GTGCCATTCA 2820 

GCACTGCAOC A6AGCAGACA ACCTATTTGA GGRAAAACAG TGAAATCCAC CTTCCTCTTC 2880

ACACTGAGCC CTCTCTGATT CXTTCCXSTGTT GTGATOTGAT GCTGGCCA06 TTTCCAAACX5 2940 

GCAGCTCCAC TGGGTCCCCT TTGGTTGTAG GACAGGAAAT GAAACATTAG GAGCTCTGCT 3000

TGGAAAACAG TTCRCTACTT AOOGATTTTT GTTTCCTAAA ACTTTTATTT TGAGGRG CftG 3060

TAGTTTTCTA TGTTTTAATO ACAGAACTTG GCTAATGGAA TTCACAGAGG TGTTGCAGCG 3120 

TATCACTGTT AT6ATCCTGT GTTTAGATTA TCCACTCATG CTTCTCCTAT TGTACTGCAG 3180

GTGTACCTTA AAACTGTTCC CAGTGTACTT GAACAGTTOC ATTTATAAGG GGGGAAAT6T 3240 

OGTTTAATGG TGCCTGATAT CTCAAAGTCT TTTGTACATA ACATATATAT ATATATACAT 3300 

ATATATAAAT ATAAATATAA ATATATCTCA TT0CA6CCA0 TQATTTAGAT T TACAS CTTA 3360 

CTCTG6GGTT ATCTCTCTGT CTAGAGCATT GTTGTCCTTC ACTGCAGTCC AGTTOGGATT 3420 

ATTCCAAAAG TTTTTTGAGT CTT6AGCTTG GGCTGTGGOC COGCTGTGAT CATACCCT6A 3480

GCACGACX3AA OCAACCTCGT TTCTGAGGAA GAAGCTTQAG TTCT6ACTCA CTGAAATGCG 3540 

TGTTGGGTTG AAGATATCTT TTTTTCTTTT CXGCCTCACC CCTTTGTCTC C3VACCTCCAT 3600 

TTCTOTTCAC TTPGTOGAGA GGGCATTACT TGTT08TTAT AGACATOGAC aTTAAOAGAT 3660 

ATTCAAAACT CAGAA6CATC AGCAATQTTT CTCTTTTCTT ACTTCATTCT GCAGAATOGA 3720 

AACCCATGCC TATTAGAAAT GACAGTACTT ATTAATTGAO TCCCTAAQGA A TATT CAGCC 3780

CACTACATAG ATAGCTTTTT ' m ' Vl ' VmT TTTTrTTTAA TAAGGA CACC TCTTTCCAAA 3840 

CAGGCCATCA AATAT3TTCT TATCTCAGAC TTAC3GTTGTT TTAAAA6TTT GGAAA6ATAC 3900 

ACATCTTTTC ATACCCCCCC TTAGGAGGTT GGOCTTTCAT ATCAOC TCAQ OCAACTGTGG 3960 

CTCTTAATTT ATTGCATAAT GATATCCACA TCAGCCAACT GTGGCTCTTT AATTTATTGC 4020 

ATAATGATAT TCACATCOCC TCAGTTGCAG TGAATTGTGA GCAAAAGATC TTGAAAGCAA 4080

AAAGCACTAA TTAGTTTAAA ATGTCACTTT TTTGGmTT ATTAT ACAAA AACCATGAAG 4140 

TACTTTTTTT ATTTGCTAAA TCAGATTGTT OCTTTTTAGT GACTCATGTT TATGAAGA6A 4200 

GTTQAGTTTA ACAATCCTAG CTTTTAAAAG AAACT ATTT A ATGTAA AATA TTCT ACATG T 4260 

CATTCAGATA TTATGTATAT CTTCTAOCCT TTATTCTGTA CTTTTAATGT ACATATTTCT 4320

GTCTI G OgTG ATnCTATAT TTCACTGGTT TAAAAAACAA ACATOGAAAG GCTTATTOCA 4380
AATQGAAGAT AQUVTATAAA ATAAAAOGTT ACTTGTAAAA AAAAAAAA 

Seq ID NO: 249 Protein sequence:
Protein Accession #: NP_003383



11 



21 
I 



31 



41 

I 



51 
I 
HA6SAMSSKP FLVALAIFFS FAQWIBANS mrSUSMMNFV
SQ6QXKLCHZ» YQOBKQYXGB GAKTGIKEOQ YQFRHRHIQfC 
AFTYAVSAAG WKAMSRACR EGELSTCGCS RAARPKDLPR 
FVDARERERI HAKGSYE5AR ILKNX^HNNBA GRRTVYNLAD 
LADPRKVGDA LKBKYDSAAA KRLNSHGKLV QVNSRPKSPT 
GSLGTQGRLC NKTSBC»DGC BLMCGGRGYD QFKrVQTERC 
QPVCK 

Seq ID NO: 250 DNA sequence
Nucleic Acid Accession #: NM_014058
Coding sequence: 56..1324






QKSBVyZIGA 
STVDNTSVFG 
DWLKGGCGDN 
YACKCHGVSO 
TQDLVYIDPS 
HOCFHWCCYV 



TGACTTGGAT 
TC36GCCAGAT 
C6TCATCTTC 
GASATATAAT 
ACTATATGCT 
TGAATGAAT6 
TCAGGTTATC 
TAGATTTCAC 
TGAAAAGCTG 
AAAAATCAAC 
TAAAACTCTA 
GCOCTGGCAG 
TGCCACATGG 
GACTGCTTCC 
AATTGTCCAT 
TTCTAGCCCT 
TGAGTTTCAA 
TTACAGTCAA 
TGAACCTCAA 
AGGAAAAACA 
AGATATCTGQ 
GCCTGGTGTT 
CTAA6AGA6A 
OCATTTTTAG 
AATAAACTGT 



11 

1 

GTAGAOCTOG 
GTG6TGAGG6 
ATATCCCTGA 
CAAAAGAAGA 
GAGTTTGGCA 
6TQAAAAATG 
AAGTTCAGTC 
TCTACT6AG6 
CAAGATGCTG 
AAGACAGAAA 
GGTCAGAGTC 
GCTAGCXnXSC 
CTTGTGAGTG 
TTTGGAGTAA 
GAAAAATACA 
GTTCCCTACA 
CCAG6TGATG 
AATCATCTTC 
GCTTACAATG 
GATGCATGCC 
TACCTT6CTG 
TATACTAGAG 
AAAGCCTCAT 
AGATACAGAA 
TTGCTT6AT6 



21 
I 

ACCTTCACAG 
CTAGGAAAAO 
TTGTCCTGGC 
CCTACAATTA 
GAGAGGCTTC 
CATTTTATAA 
AACA6AAGCA 
ATCCTGAAAC 
TAGGACCGCC 
CAGACA6CTA 
TCAG6AT06T 
AGTGGGATGG 
CTGCTCACTG 
CAATAAAACC 
AACACXXIATC 
CAAATGCAGT 
TGATGTTTGT 
GACAAGCACA 
ACGCX31TAAC 
AGGGTGACTC 
GAATAGTGAG 
TTACGGCCTT 
GGAACAGATA 
TTGGAGAAGA 
CAAAAAAAAA 



31 
I 

GACTCTTCAT 
AOTTTGnGG 
AGTGT6CATT 
CTATAGCACA 
TAACAATTTT 
ATCTCCA7TA 
TGGAGT6TTG 
T6TAGATAAA 
TAAAGTAGAT 
TCTAAACCAT 
TGGTGGGACA 
GAGTCATOGC 
TTTTACAACA 
TTOGAAAATG 
ACATGACTAT 
ACATAGA6TT 
GACAGGATTT 
GGTGACTCTC 
TCCTAGAATG 
TGGAGGACCA 
CTGGGGAGAT 
GOGGGACTGG 
ACATTTTTTT 
CTTGCAAAAC 
A 



41 

1 

TGCTGGTTGG 
6AAC0CT6GG 
GGACTCACTG 
TTGTCATTTA 
ACAGAAATGA 
AGGGAAGAAT 
GCTCATATGC 
ATTGrTTCAAC 
CCTCACTCAG 
TGCTGOGGAA 
GAA6TAGAA6 
T6TGGAGCAA 
TATAAGAACC 
AAAOdGGGTC 
GATATTTCTC 
TGTCTCCCTG 
GGAGCACTGA 
ATAGAaSCTA 
TTATGT6CT0 
CTGGTTAGTT 
GAATGTG06A 
ATTACTTCAA 
TTGTTTTTTG 
ACCTAGATTT 



Seq ID NO: 251 Protein sequence: 
Protein Accession #: NP_054777



MYRPDWRAR 
DKLYABFGRB 
ICRFHSTBDP 
RSKTLGQSLR 
RWTASPGVTI 
SYEFQP6DVM 
XiBGKTDAOQO 
GI 



11 
I 

KRVCWEFWVI 
ASHNFTEMSQ 
ETVDKXVQLV 
rVGGTEVEBG 
KPSKHKRGIiR 
FVTGFGALKN 
DSG6PLV8SD 



21 
I 

GLVZFISLIV 
RLBSMVRNAF 
LBEKLQ&AVO 
EtfPWQASLQW 
RIIVHEKYKH 
DGYSQNKLRQ 
ARDIWYLAGI 



31 

I 

LAVCZ6LTVH 
YK8PLREEFV 
PPKVDPHSVK 
DGSHROGATL 
PSHDYDISIiA 
AQVTLIDATT 
VSNS>ECAKP 



41 

I 

YVRYNQKKTY 
XSQVZXFSQQ 
IKKXNKTETD 
ZNATWLVSAA 
ELSSPVPYTO 
OTEPQAYNDA 
NXPOVYTRVT 



QPLCSQLAGL 
RVI4QIGSRET 
IDYGYRFAKE 
SCSLKTCWLQ 
PDYCVHNEST 
KCKKCTBIVD 



51 
I 

CAATGATGTA 
TTATGGGCCT 
TTOVTTATGT 
CAACTGACAA 
GCCAGAGACT 
TTGTCAAGTC 
TGTTGATTTG 
TTGTTTTACA 
TTAAAATTAA 
CACGAAGAAG 
AG6GTGAATG 
CCTTAATTAA 
CTGCCAGATG 
TCCGGAEAAT 
TTGCAGAGCT 
ATGCATCCTA 
AAAAIGATGO 
CAACTTGCAA 
GCTCCTTAGA 
CAGATGCTAG 
AACCCAACAA 
AAACTGGTAT 
GGTGTGGA6G 
GACIXSATCTC 



51 
I 

NYYSTLSFTT 
KBGVLAHMIjL 
SYIiNHCOGTR 
HCFTTYKNPA 
AVERVCLPDA 
ZTPRMLCAGS 
ALRDMZTSiCT 



Seq ID NO: 252 DNA sequence

Nucleic Acid Accession #: NM_003504.2

Coding sequence: 71-1771



GGCACGAGOC 
OGCOGTGGCr 
GAG06TCCTT 
GGCCTTGTTC 
ACTT6AAACT 
TGGA6CTAAT 
GT6TGACA0C 
ACTCATTAAA 
AGA6GAGGAT 
CACAOGGTTA 
G6A6G0C06G 
GTCAGCCATQ 
GTGGTQOQCC 
ATA0GT6ACT 
GGAT6AGGAG 
CCT6GTGCTC 
ASCCAGQTTC 
CATGGaTCTT 
GGAGAATTTG 
OGTGCAGACT 
CTTTGCCACC 
CATOCAOGCT 
ACT06CCAA6 
CCTCGTCATC 
CATGCTGTTC 
TCTGTGTTCG 



11 
I 

CTGGT6C0GC 
ATGTTO G TGT 
CTCTTOGTGG 
CAGTGTGACC 
GCATTTCTTG 
GTAGACCTAT 
CATAOGGCAO 
CAA6AT6ATG 
GAA6AGCATT 
GAAGAGGAGA 
AGAAGAGACA 
GTQATGTTT3 
ATOGTTGGAC 
GATGTT6GTG 
AACACACrCT 
TACCA GCACT 
AAGCTGT6GT 
CCCCTGAA6C 
CGGGAAATGA 
TTCAGCATTC 
ATOTCTTTGA 
CTGGACAGOC 
AA6CA6CTGC 
TCCCAGGGGC 
TCXAQGC06G 
ACAAAGAACC 



21 

1 

06GGCTCTT6 
CG6ATTTCGG 
CCT0GGA06T 
ACGTGCAATA 
AGCATAAAGA 
TGGATATTCT 
TCAATGTCGT 
ACCTT6AAGT 
CA66AAATGA 
TAGTGGAGCA 
TOCTCTTTGA 
AGCTG6CTT6 
TAACAGACCA 
TCCTGCAGCG 
COGTGGACTO 
GGTCCCTCCA 
CTB16CATGG 
AGGTGAA6CA 
TTGAAGAGTC 
ATTTTGGGTT 
TGGAQAGCXX? 
TCTCCAGQAO 
GAGC CACCCA 
CTTTCCTGTA 
CATCCCTAAG 
GG06CTGCAA 



31 
I 

GTACCTCAGC 
CftAAGAGTTC 
GGATGCTCTG 
TAOOCTGGTT 
ACAGTTTCAT 
TCAACCTGAT 
CAATGTATAC 
TCCCGCCTAT 
CAGTGATGGG 
AACCATGCGG 
CTACGAGCAG 
GATGCTGTCC 
G T GG GT G CAA 
OCAOGTTTCC 
CACAOGGATC 
TGACAGCCTG 
ACAGAA60GG 
GAAGTTCCAO 
TGCAAATAAA 
CAAGCACAAG 
CSASAAGGAT 
TAA0CTG6AC 
6CAGACCATT 
CTGCTCTCTC 
OCTGCTCAOC 
ACTGCTGCCC 



41 

1 

6CGAG06CCA 
TACGA66TG6 
TGTGGGT6CA 
CCAGTTTCTO 
TATTTTATTC 
GAAGACACTA 
AACX3ATACXC 
GAAGACKTCT 
TCAGAGCCTT 
AGGA6GCAGC 
TATGAATATC 
AA66ACCTGA 
GACAAGKTCA 
CGCXACAACC 
TCCTTTGAGT 
T6CAACACCA 
CTCCAG6AGT 
GCCATGGACA 
TTTGGGATGA 
TTTCTGGOCA 
GGC7CAGGQA 
AAGCTGTACC 
6CCA8CTGCC 
ATGQAGGGOV 
AAACACCTGC 
CTGGTGATGG 



60 
120 
180 
240 
300 
360 



60 
120 
180
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 

1080

1140 
1200 
1260 
1320 
1380
1440 



60 
120 
180
240 
300 
360 
420 



51 
I 

GGOGTCGGGC 
TOCAGAGCCA 
AGATCCTTCA 
GGTGGCAAGA 
TCATAAACTG 
TATTCTTT6T 
A6ATGAAATT 
TCAGGGATGA 
CTGAGAAGOG 
GGCGAGAGTG 
ATGGGACATC 
ATGACATGCT 
CrCAAATGAA 
ACCGGAACQA 
ATGACCTCCG 
GCTATACOGC 
TCCTT6CAGA 
TCTCCTIGAA 
AQGACATGOG 
GOGAOGTGGT 
CA6ATCACTT 
ATOGCCTGOA 
TTTGCACCAA 
CTCCAGATGT 
TCAAGTCCTT 
CTGCCOOCCT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
GAGCATGGAG CA'TCGCACAG TGACOGTGGT G6GCATOCCC 0CA6AGACC8 ACAGCTOQGA 1620

CftGGAAGAAC TTTTTTGGGA GOGOGTTTGA 6AA6GCA6CX3 GAAA6CACCA GCTC0006AT 1680

GCTGCACAAC CATTTTGACC TCTCAGTAAT TGAGCXtSAAA GCTGAGGATC GGAGCAAGTT 1740 

TCTGGAOGCA CTTATTTCCC TCCTGrCXTTA GGAATTTGAT TCITCCAGAA TGACCTTCTT 1800 

ATTTATGTAA CltSGCTTTCA TTTAGATTGT AAGTTATC5GA CATGATTTGA GATGTAGAAG 1860

ocArrrrrEA ttaaataaaa tgcttatttt agggtcggtc cccsu^aaaaa aaaaaaaam^ 1920 

AAAAAAAAAA AA 



Seq ID NO: 253 Protein sequence: 
Protein Accession #: NP_003495.1



1 11 21 31 41 51 

I I I I I I

MPVSDFRKEP YEWQSQRVL LFVASDVDAL CACKIXiQALP QGDHVQYTLV PVSGWQELBT 60 

AFIiEHKBQFH YFILINOQAN VDLLDIZiQPD EDTZFFVCDT ERPVNWNVY NDTQZKLLIK 120 

QDDDLEVPAY EDIFRDSEED EEH86IIDSD6 SEPSEKRT&L EEEZVBQriKR RRQKEtEMBAR 180 

RSDILFDYEQ YEYHGTSSAN VMFELAHMLS KDLKDMLHHA ZVGLTDQfWVQ DKITOMKYVT 240 

DVGVLQRHVS RHNHHNSDEE NTLSVDCTRI SFEYDLRLVL YQHWSLHDSX* aJTSYTAARP 300 

KLWSVHGQKR LQEFLADMSL PLKQVKQKPQ AMDZSLKEHL RBKIEESAKK POtKDMRVQT 360 

PSIHFGFKHK FLASDWFAT MSLMBSPEKD GSGTDBPZQA UJSLSRSSLD KLYKGLELAK 420 

KQLRATQQTI ASCbCTiniVI SQGPPLYCSL MB6TFD>VMLP SRPASLSLLS KHLIiKSFVCS 480 

TKHRRCKLLP LVKAAPLSMB BGTVTWGIP PETDSSDRKK FFGRAFEKAA ESTSSBMLRH 540 
KFDLSVIEUC AEDRSKPLDA LZSUiS 

Seq ID NO: 254 DNA sequence
Nucleic Acid Accession #: NM_022337
Coding sequence: 48..683



1 11 21 31 41 51 

I I I I I I 

GGCTCCGCTT CCCTGGTCAG GCAOOGCAOG TCTGGCCGGC CGCCAGGATG CAGGCCCOGC 60 

ACAAGGAOCA OCTGTACAAG TT0CTG8TQA TT6GGGACCT OGGOGTGGGG AAGACCAGTA 120 

TGATCAAGOQ CTACGTGCAC CAGAACPTCT CCTCGCACTA COSGGCXaVCA ATOSGOGTGG 180 

ACTTCGCGCT CAAGGTGCTC CACTGGGACC OGGAGACTGT GGTGCX3CCTG CAGCTCTGGG 240 

ATATCGCAGG TCAAGAAAGA TTTGGAAACA TGACGAGGGT CTATTACCGA GAAGCTATGG 300 

GTGCATTTAT TGTCTTCGAT GTCAOCAGGC CAGCCACATT TGAAGCAGTG GCAAAGTGGA 360 

AAAAT6ATTT GQACTOCAAG TTAAGTCTGC CTAAIGGCAA ACOGGTrrCA 6TG6TTTTGT 420 

TGGCCAACAA ATGTOACCAG GG6AA0GATG T6CTCATGAA CAATGGCCTC AAGATGGACC 480 

AGTTCTGCAA GGAGCAOGGT TTCGTAGGAT GGTTTGAAAC ATCAGCAAAG GAAAATATAA 540 

ACATTGATGA AGCCTCCAGA TGCXrTGOTGA AACACATACT TGCAAATGAG TGTGACCTAA 600 

TGGAGTCTAT TGAGCCXK3AC GT0GT6AAGC CCCATCTCAC ATCAACCAAG GTTGCCAGCT 660 

GCTCTGOCTO TGCCAAATCC TA0TA66CAC CTTTGCTSGT GTCTGGTAG6 AATGACCTCA 720 

TTGTTCCACA AATTGTGCCT CTATTTTTAC CATTTTGGOT AAAOSTCAGG ATAQATATAC 780 

CACATGTGGC AAGCCAAAGA TCTATGCCTC TGTTTTTTCA ATGAGAGAGA AATAGCAAAT 840 

GTTCTTTCTA TGCTTTCCTC ACCATCATCA CAGTGTTTAC AAACTTTTGA AAATATTTAG 900 

TCTGTTACAA ACTTCTGTCA TGTAGCT6AC CAAAATCCTG CA6GGCCACA GTOGGCACTG 960 

TTATTTGCTT CTTTTAATCA GCAAAGGCCT CAAOTCTTAA AATAAAAGG6 GAGAAGAACA 1020 

AACTAGCTGT CAAGTCAAGO ACTGGCTTTC ACCTTGOCCT OtfimcmT TCCAGATTTC 1080

AATATATTCT CTCATGGCCT GACAGGCCTA TTAAGTAGAT GTGATATTTT CITCCAAGAT 1140 

GACCTCCATT CTCGGCAGAC CTAAGAGTTG CCTCTGAGTT AGCTCTTTGa AATOGTGAAC 1200 

ACAGGTGTGC TATATTGTCC TTGTCCTAAC TGTCACTTGC CATGGCCTGA ATGT TGGCT T 1260 

AACTGAATAT TGTAIX3AAAA GACATQCCTC CATATGTGCC TTTCTGTTAG CTCTCTT3GA 1320 

CTCAAOCTQT QGGOCTCCTC TATACATGCT ATACATGTAA TATATATTAT ATATATTTTT 1380 
GCAAGTGAAC AATAAAACAT TAAAAGATAA AA 

Seq ID NO: 255 Protein sequence:
Protein Accession #: NP_071732



1 11 21 31 41 51

I I I I I I

MQAPHREHLY KZiLVZGDLGV 6KTSZZKRYV HONFSSBYRA TIGVDFALKV IjBWDPETWR 60 

LQLWDIAGQB RFGHMTRVYY SEANGAFIVF DVTRPATPSA VAKWKNDLDS RLSLPHGXFV 120 

SWIiLANKCD QGKDVLMNKO LRMDQFCKEE GFVGHFETSA KBEm^ZDEAS RCLVKHZIiAN 180 
BCDLMBSZSP DWKFHL78T KVASCS6CAK S 

Seq ID NO: 256 DNA sequence 
Nucleic Acid Accession #: NM_016321
Coding sequence: 25..1464

1 11 21 31 41 51 

I I I I I I 

G6AACG6CCC GCTGCCAGGC OGGCCAGGCA CCCCTGCAQC ATG6CCTGGA ACA0CAACC7 60 

C06CTGG0GG CTGCCGCTCA CCTGCCTGCT CCTGCAGGTG ATTATGGTGA TTCTCTTOGG 120 

GGTGTTCGTG OGCTACGACT TCGAGGCCGA CGCCCACTGG TGGTCAGAGA GGACGCACAA 180 

6AACTTGA0C GACATGGAGA ACGAATTCTA CTATC GCTA C CCAAGCTTCC AGGAOGTGCA 240 

GGTGATGGTC TTGGTGG6CT TCG6CTTCCT CAT6AC7T7C CTGCAGOGCT A0GGCTTCA6 300 

CGCOGTGGGC TTCAACTTOC T6TTGGCA6C CTTOGGCATC CAGTC060GC TGCTCATGCA 360 

GGGCTGGTTC CACTTCTTAC AAGAOCGCTA CATOGTCGTG GGC6TGGAGA ACCTCATCAA 420 

CGCTGACTTC TGCGTGGCCT CXtSTCTGOST GGCCTTTGGG GCAGTTCTGG GTAAAGTCAG 480 

CCCCATTCAG CTGCTCATCA TGACTTTCTT CCAAGTGACC CTCTTCGCTO TGAATGAGTT 540 

CATTCTCCTT AACCTGCTAA AGGTGAAOGA TGCAGGAGGC TCCATGACCA TCCACACATT 600 

TGGOGCCTAC TTTGGGCTCA CAGTXUVCCOG GATOCTCTAC 0GA06CAA0C TAGAGCACaG 660 

CAAGGAGAGA CAGAATTCTG TGTACCAGTC GGAOCTCTTT GCCATGATTG GCACCCTCTT 720 

CCTGTGGATG TACTGGCCCA GCTTCAACTC AGCCA TATC C TACCATGGGG ACAGCCA6CA 780 

COSAGOCSOC ATCAACACCr ACXGCTCCTT GGCAGCCTGC GTGCTTAOCT OQGIGGCAAT 840 
ATCCAGTGCC CIGCACAAGA AGGGCAAOCT GGACATGGTO CACATC CAGA ATGCCAOGCT 900

GGCAGGAGGG GTGGOOGTGG GTA006CTGC TGAGATGATG CTCATGCCTT ACGGTGCCCT 960 

CATCATCGGC TTOGTCTGOG GCATCATCTC CACCCTGGGT TTTGTATAOC TGACCCCATT 1020 

CCTGGAGTCC 0G6CTGCACA TCCAGGACAC ATGTGGCATT AACAATCTGC ATGGCATTOC 1080 

TG6CATCATA 6606GCATCG TGGGTGCTGT GACAGOSGCX: TCOGCCAGGC TT6AAGTCTA 1140 

TGGAAAA6AA OGGCTTGTCC ATTOCTTTGA CTTTCAAflOT TTCAA0GGG6 ACI GGAOO SC 1200 

AAGAACACAG GSAAAQTTCC A6ATTTATGG TCTCTTGGTG AOCCTOGOCA TGGCCCTGAT 1260 

GGGTGGCATC ATTGTGGGGC TCATTTTGAG ATTACCATTC TGGGGACAAC CTTCAGATGA 1320

GAACTGCTTT GAGGATGOGG TCTACTGGGA GATGOCTGAA GGGAACAGCA CTGTCTACAT 1380 

OOCTGAGGAC CCCACCTTCA AGCCCTCAGG ACOCTCAGTA CCCTCACTAC CCAT6G7GTC 1440 

OCCACIACCC ATGGCTTCCI OGGTAOOCTT 6GTAG0CTAG GCTOCCAGQQ CAGQTGAG6A 1500 

GCAGGCTCCA CAGACTSTCC TGGGGCCCAG A6SAGCTGGT GCTGACCTAO CTAGGGAtGC 1560 

AAGAGTGAGC AAGCAGCACC CCCACCT6CT GGCTTGGCCT CAAGGTGCCT CCACCCCTGC 1620 

CCTCCCCTTC ATCCCAGGGG GTCTOJCTGA GAATGGAGAA GGAGAA GCTA CAAAG TGGGC 1680 

ATCCAAGCOG GGTTCTGGCT 6CAGAAGTTC TGCCTCTGCC TGGGGTCTTG GCCACATPGG 1740 

AGAAAAACAG GCTGAAAGTG G66CTGGGAC CTGGTGGGTG AAOCTGAGCT CTCOCAGGAG 1800 

ACAACTTA6C 7GCCAGTCAC CACCTATGAG GCTCTTCTAC 0C0GT6OCTG CACCT0Q6CC 1860

AGCATCTCCT ATGCTCOCTG GGTCCCCCAO AOCTCTCTOT GTTGTGrGCS TGGCAGCCTC 1920 
CAG6AATAAA CATTCTTGTT GTCCTTTGXA AAAAAAAAAA AAAAA A AA 

Seq ID NO: 257 Protein sequence:
Protein Accession #: NP_057405

1 11 21 31 41 51

I I I I I I 

MAWNTNIiRWR LPLTCLLLQV IKVILPGVPV RYDFEADAHW NSERTHIOILS DMENBFYYRY 60 

PSFQDVHVMV FVGPGFLMTF LQBYGPSAVG FNFLLAAPQI QHALM5QGHP HPLQDRyiW 120 

GVENLINADP CVASVCVAFG AVLGKVSPIQ LLIMTFFQVT LPAVNEPILL NLLKVKDAGG 180 

SfMTIHTPGAY PGLTVTRILY RRNLBQSKER QNSVYQSDLF AMIGTLFLWM YWPSPNSAIS 240 

YHGD8QHRAA INTYCSLAAC VLTSVAISSA liHKKGKLDMV HIQNATLAGG VAV6TAAEMM 300 

LMPYGALIIG FVOGIISTLG FVYLTPPLES RLHIQDTOGI MNIiHGZFGIX 6GIVGAVTAA 360 

SASLEVYGKE GLVHSFDFQG FNGDHTARTQ GKFQIYGLLV TLAMALMGGI IV6LILKLPF 420 
WGQPSDEHCF EDAVYWEMPE GNSTVYIPBD PTFKPSGPSV PSVPMVSPLP MASSVPLVP 

Seq ID NO: 258 DNA sequence 

Nucleic Acid Accession #: NM_002358.2 

Coding sequence: 75..692

1 11 21 31 41 51 

I I I I I I

GGGAAGTGCT GTTGGAGCOQ CTGTGGTTGC TGTCCGCGGA GTGGAAGCGC GTGCTTTTGT 60 

TTGTCTCCXrr 0GCCA3GG08 CTGCAGCTCT CTOGGGAGCA GG6AATCACC CTGCGC GGGA 120 

G03C0SAAAT OGTGGCOGAG TTCTTCTCAT TOGGCATCAA CAOCATTTTA TAT CAGCGT G 180 

GCATATATCC ATCTGAAACC TTTACTCGAG TGCAGAAATA OQGACTCACC TTGCTTGTAA 240 

CTACTGATCT TGAGCTCATA AAATACCTAA ATAATGTGGT GGAACAACTG AAAGATTGGT 300 

TATACAAGTG TTCAGTTCAG AAACTQGTTG TAGTTATCTC AAATATTGAA AGTGGTGAGG 360 

TGCTGGAAAG ATGGCAGTTT GATATT6AGT GTGAGAAGAC TGCAAAAGAT GACAGTGCAC 420 

CCAGA6AAAA GTCTCAGAAA GCTATCCAGO. ATGAAATCOO TTCAGTGATC AGACAGATCA 480 

CAGCTAC6GT GACATTTCTG CCACTGTTGG AAGTTTCTTG TTCATTTGAT CTGCTGATTT 540 

ATACAGACAA AGATTTGGTT GTACCTGAAA AATOQGAAGA GTCGGGACCA CAGTTTATTA 600 

CCAATTCTGA GGAAGTCOBC CTTCGTTCAT TTACTACTAC AATCCACAAA GTAAATAGCA 660 

TGGTGOCCTA CAAAATTOCT GTCaATGACT 6AGGATGACA TSAGQAAAAT AATGTAATTG 720 

TAATTTTQAA AT6TG0TTTT CCTGAAATCA 6GTCATCTAT AGTTQATATG TTTTATTTCA 780 

TTOGTTAATT TTTACATGGA GAAAACCAAA ATGATACTTA CTGAACTGTG TGTAATTGTT 840

CCnTATTTT TTTGGTACCT ATTTGACTTA CCATGGAGTT AACATCATGA ATT TATTG CA 900 

CATTGTTCAA AAG6AACCAG GAGGTTTTTT TGTCAACATT GTGATGTATA TTCCTTTGAA 960 

GATAGTAACT GTAGATG6AA AAACTT6T0C TATAAAGCTA GATGCTTTCC XAAATCA6AT 1020 

GTTTTGGTCA AGTAGTTTGA CTCAGTATAG GTA6GGAGAT ATTTAAGTAT AAAATACAAC 1080 

AAAGGAAGTC TAAATATTCA GAATCTTTGT TAAQOTCCTO AAAGTAACTC ATAATCTATA 1140 

AACAATGAAA TATTGCTGTA TAGCTCCTTT TGACCTTCAT TTCATGTATA GTTTTCCCTA 1200 

TTGAATCAGT TTCCAATTAT TTGACTTTAA TTTATGTAAC TTGAACXTTAT GAAGCAATGG 1260 

ATATTT6TAC TGTTTAATGT TCTGTGATAC AGAACTCTTA AAAATGTTTT TTCATGTGTT 1320 

TTATAAAATC AAGTTTTAAO TGAAAGTGAG GAAATAAAGT TAAGTTTGTT TTAAAAAAAA 1380

AAAAAAAAAA 



Seq ID NO: 259 Protein sequence:
Protein Accession #: NP_002349.1

1 11 21 31 41 51 

I I I I I I 

MALQLSREQO ZTLSGSABIV AEPPSFGIHS ILyQRGIYPS ETFTRVQKYG LTLLVTTDLE 60 

UlKmWWZ QLKDHLYKC8 VQICLVWISN ZBSGEVLERN QFDXECDKTA XDDSAPRSK8 120 

QKAIQ0EIR8 VIRQZTATVT FX«PLLBVSC8 FDUiIYTOKD LVVFERHEBS GPQPI139SEB 180 
VRLRSPTTTI BKVNSMVAYK IPVMD 

Seq ID NO: 260 DNA sequence
Nucleic Acid Accession #: NM_001211
Coding sequence: 43..3195

1 11 21 31 41 51 

AAAGGCCTGC AGCAG6A0QA 6GACCTQA6C CAGGAATGCA GGATG6066C GGTGAAOAAG 60 

GAAGGGCCTG CTCTGAGTGA AGCCATGTCC CXGGAGGGAG ATGAATX3GGA ACTGAGTAAA 120 

GAAAATGTAC AACCTTTAAG GCAAGGGCGG ATCATGTCCA OGCTTCAGGG AGCACPGGCA 180 

CAAGAATCTG CCT6TAACAA TACTCTTCAG CAGCAGAAAC GGGGATTTGA ATATGAAATT 240 
CGATTTTAC^ CTGGAM^TGA COCTCTGGAT 6TTTGGGATA G6TATATCAG CT6GACAGAG 300

CAGAACIATC CTCAAGGTQG 6AAAGAGAGT AATATGTGAA GGTTATTAGA AAGAGCTGTA 360 

GAAGCACTAC AAGGAGAAAA AOSATATTAT AGTGATCCTC GATTTCTCAA I'Ci'ClX J G L Tf 420 

AAATTAGGGC GTTTATGCAA TGAGCCTTTG GATATGTACA GTTACTTGCA CAACCAAGGG 480 

ATTGGTGTTT CACTTGCTCA GTTCTATATC TCATGGGCAG AAGAATAT6A AGCTAGAGAA 540 

AAGTTZAG6A AAGCAGAT6C GATATTTCAG GAAG6GATTC AACAGAAGGC TGAAOCACTA 600 

GAAAGACTAC AGTCCCAGCA C08ACAATTC CAAGCTG6AG TGTCTCX36CA AACTCTGTTG 660 

GCACTTGAGA AAGAAGAAGA GGA£3GAAGTT TTTGAGTCTT CPGTACCACA AOGRAGCACA 720 

CTAGCTGAAC TAAAGAGCAA AGGGAAAAAG ACAOCAAGAG CTCCAATCAT CXDGTGTAGGA 780 

G6TGCTCTCA AGGCTCCAA6 0CAGAACA6A OGACTCCAAA ATCOITTTCC TCAACAGAT6 840

CAAAATAATA GTAGAATTAC T0TTTTT6AT GAAAATGCT6 ATGAG6CTTC TACA6CAGAS 900 

TTGTCTAAGC CTACAQTCCA GCCATGGATA 6CACCCCCCA TGCCCAGGGC CAAAGAGAAT 960 

GAGCTGCAAG CAGGCCCTTG GAACACAGGC AGGTCCTTGG AACACAGGCC TOGTGGCART 1020

ACAGCTTCAC TGATAGCTGT ACCCGCTGTG CTTCCCAGTT TCACTCCATA TGTGGAAGAG 1080 

ACTGCACAAC AGCCAGTTAT GACACXATGT AAAATTGAAC CTAGTATAAA CCACATCCTA 1140 

AGCACCAGAA A6CCTGGAAA GGAA6AAGGA GATCCTCXAC AAAGGGTTCA GAOGCATCAG 1200 

CAAGCGTCTG A6QA6AA6AA AGAGAAGATQ ATGTATTGTA AGGAGAAGAT TTAT6CAGGA 1260 

6TAGGGGAAT TCTCCTTTGA AGAAATTOGG GCTGAAGTTT TCCGGAAGAA ATTAAAAGAG 1320 

CAAAGGGAM? CCX3AGCTATT GACCAGTGCA GAGAAGAGAG CAGAAATGCA GAAACAGATT 1380 

GAAGAGATGG AGAAGAAGCT AAAAGAAATC CAAACTACTC AGCAA6AAA6 AACAG6TGAT 1440 

CAGCAAGAAG AGAOGATGCC TACAAAGGAO ACAACTAAAC TGCAAATTGC TTCOGAGTCT 1500 

CAGAAAATAC CAGGAATGAC TCTATCCAGT TCTGTTTGTC AAGTAAACTG TTGTGCCAGA 1560 

GAAACTTCAC TT6CGGAGAA CATTTGGCAG GAACAACCTC ATTCTAAAGG TCOCAGrGTA 1620 

CCTTTCTCCA TTTTTGAT6A G ' mvnViT TCAGAAAAGA AGAATAAAAQ TCCTC3CTGCA 1680 

6ATCCCCXAC GAGTTTTAGC TCAACGTUUSA CXCCTTGCAG TTCTCAAAAC CTCAOAAAGC 1740 

ATCACCTCAA ATGAAGATGT GTCTCCAGAT GTTTGTGATG AATTTACAGG AATTGAACCC 1800 

TTGAGCGAGG ATGCCATTAT CACAGGCTTC AGAAATGTAA CAATTTGTCC TAACCCAGAA 1860 

GACACTTGTO ACTTTGCCA6 AGCAGCTCGT TTTGTATCCA CTCCTTTTCA TGAGATAATG 1920 

TCCTTGAA6G ATCTOCCTTC T6ATCCTGA6 AGACTGTTAC CXKaUVGAAGA TCTAGATGTA 1980 

AAGACCTCTG AGGACCAGCA GACAGCTTGT GGCACTATCT ACAGTCAGAC TCTCAGCATC 2040 

AAGAAGCTGA 6CCCAATTAT TGAAGACAOT CGTGAAGCCA CACACTCCTC TGGCTTCTCT 2100 

GGTTCTTCTG CCTCGGrTGC AAGCACXTTCC TCCATCAAAT GTCTTCAAAT TCCTGAGAAA 2160 

CTAGAACTTA CTAATGAGAC TTCAGAAAAC CCTACTCAGT CACCATGGTO TTCACAGTAT 2220 

O6CAGACA0C TACTGftAGTC CCTAGCA6A0 TTAAGTGCCT CT6CAGAGTT 6TGTATAGAA 2280 

GACAGACCAA TGCCTAAGTT GGAAATTGAS AAGGAAATTG AATTAGGTAA TCAGGATTAC 2340 

TGCATTAAAC GAGAATACCT AATATGTGAA GATTACAAGT TATTCTGGGT GGCGCCAACA 2400 

AACTCTGCA6 AATTAACAGT AATAAAGGTA TCTTC TCAA C CTG TCCCAT G GGACTTTTAT 2460 

ATCAACCTCA AGTTAAAGGA A0G7TTAAAT GAAGATTTTG ATCATTTTTO CAQCTGTTAT 2520 

CAATATCAAG ATGGCTQTAT T6TTTGGCAC CAATATATAA ACTGCTTCAC CCTTCAG6AT 2580 

CTTCTCCAAC ACAGTGAATA TATTACCCAT GAAATAACAG TGTTGATTAT TTATAACCTT 2640 

TTGACAATAG TGGAGATGCT ACACAAAGCA GAAATAGTCC ATGGTGACTT GA6TCCAAGG 2700 

TGTCTGATTC TCASAAACAG AATCCAOGAT CCCTATGATT GTAACAAGAA C AATCAA GCT 2760 

TTGAASITAG TG6ACTTTTC CTACAGTGTT GACCTTAGGS TGCA6CTGGA T6TTTTTA0C 2820 

CTCA606QCT TTCGGACrGT ACAGATCCTG GAAGGACAAA AGATCCTGGC TAACTGTTCr 2880 

TCTCCCTACC ABGTAGACCT GTTTGGTATA GCAGATTTAG CACATTTACT ATTGTTCAAG 2940 

GAACACCTAC AGGTCTTCTG GGATGGGTCC TTCTGGAAAC TTAGCCAAAA TATTTCTGAG 3000 

CTAAAAGATG GTGAATTGTG GAATAAATTC TTTGTGCX3GA TTCTGAATGC CAATGATGAG 3060 

GCCACAGTGT CT6TTCTTGG GGA6CTTGCA GCASAAAIGA ATGGGGTTTT TGACACTACA 3120 

TTCCAM6TC ACCTC»ACAA AGCCTTATGQ AAGQTA66GA A6TTAACTA0 TCCT06G6CT 3180 

TTGCTCTTTC AGTGAGCTAG GCAATCAAGT CTCACAGATT GCTGCCTCAG AGCAATGGTT 3240 

GTATTGTGGA ACACTGAAAC TGTATGTGCT GTAATTTAAT TTAGGACACA TTTAGATGCA 3300 

CTACCATTGC TGTTCTACTT TTTGGTACAG GTATATTTTO ACGTCACTGA TATTTTTTAT 3360 

ACAGTGATAT ACTTACTCAT GGCCTTGTCT AACTTT7QTG AAOAACTATT TTATTCTAAA 3420 

CAGACTCATT ACAAATGOTT ACCTTGTTAT TTAhCCCATT TSTCTCTACT TTTCCCTGTA 3480 

CTTTTOCCAT TTGTAATTTQ TAAAATGTTC TCTTATGATC ACCATGTATT TTGTAAATAA 3540 
TAAAATAGTA TCTQTTAAAA AAAAAAAAAA AAAAAAAAAA AAA 

Seq ID NO: 261 Protein sequence:
Protein Accession #: NP_001202

1 11 21 31 41 51 

I I I I I I

MAAVKKEOGA LSEAMSLBGD EHBLSKE3IVQ PLRQ6RXMST IiQGAIiAQBSA OINTLQQQKR 60 

AFEYBIRFYT GHDPLOVHDR YISHTBQNYP Q66KBSHMST UiERAVBALQ GSKRVySDFR 120 

FLKUfLKLGR LCNSPLDMYS YLHNQGIGVS LAQFYZSHAB BYEARQIFRK ADAIFQEGIQ 180 

QXABPLERLQ SQHRQPQARV SRQTLLALEK EEEZEVFBSS VPQRSTLAEL KSKGKiCrARA 240 

PZIRVGOALK APSQNRGIiQN PPPQQMQHN5 RITVFDE32AD EASTAELSXP TVQPWIAPPM 300 

PRAKEHELQA GPHMTGRSLE ERPRGNTASL lAVPAVIiPSF TPyVBBTAQQ FVMTPCXXEP 360 

SZNHZLSTRR PGKEEGDPLQ RVQSHQQASE EKKEKHKYCK ERIYAGVGBP SFEB IRAEV F 420 

RKKLKEQREA ELLTSAEKRA EMQKQIEEME KKLKEIQTTQ QERTGDQQEB TMPTKETTKL 480 

QIASESQKIP QMTLSSSVCQ VNCCARETSL AQIIWQBQPH SKGPSVPFSI FT3BPU*SEKK 540 

NRSPPADPPR VLAQRRPLAV LKTSESITSN EDVSPDVCDE PTGIBPLSED AIITGFRNVT 600 

ICPNPEDTCD FARAARFV67 PFBEIMSLKD LPSDPERXiLP EEDXiDVKTSE DQQTAOGTIY 660 

SQTXiSIKKLS PIIEDSREAT HSS6FSGSSA SVASTSSIXC LQIPSKLELT NKTSEKPTQS 720 

PKC8QYRRQL LKSLPELSAS AELCIEDRPM PKLEIEKEIE LGNEDYCIKR BYLICEDYKL 780 

FWVAPRNSAE LTVIKVSSQP VPWDFYINLK LKERUIEDFD HPCSCYQYQD GCIVWHQYIN 840 

CFTX^DZjUQH SEYITHEXTV LIIYNLLTIV EMIJQCAEIVB GDLSPROjZL RNRIEDFYDC 900 

moiKQALKZV DFSYSVDLRV QLDVFTLSGF RTVQIXiBQQK lUUaCSSFYQ VDIiFGZADLA 960 

HLLIiFKEHLQ VPHDG5FWKL SQNISELKDG BLMHKPPVRI UmNDEATVS VL6ELAAB0I 1020 
GVFDrrFQSB LHKALWKVGK LTSPGAUjFQ 

Seq ID NO: 262 DNA sequence 
Nucleic Acid Accession #: NM_003784 
Coding sequence: 365..1507

1 11 21 31 41 51 
I I I I I I

GTCIACTTAT C3UltAAGCAC CT G CCTCT G C A®GT0CAG6 CTGCACCTTT GCTCTGCCTT 60 

TAAAACTGAA TTCTCAGAAT TTTAGAACAA ATTTTTGTCT AGAAATGCTG ACTTTGGTTC 120 

ATTAGGTAGT GGTAAAACAG GCTCCCTTCG AAOCTCTCCT TCATCACCTT CCTAAGTGCA 180 

TGTACAGGGA AGCTCTCCTT CATCACCTTC CTAAGTGCAT GGGGGAAAAT ACCTAGGGCT 240 

CAACAGTCTT 6AGAAGTGTG GAAACATTTT CTTTGTGAGT GAGAACAGAT CAGCTAGAGA 300 

AAGGAAACOl GATTCCCATC ACTGCTTCTG G6TATCAGAT GCTAGOGCTG CACTCCATTT 360 

tGCAATGGCC TCCCTTGCTG CAGCAAATGC AGAGTTTTGC TTCAACCTGT TCA6AGAGAT 420 

GGATGACAAT CAAGGAAATG GAAATGTGTT CTTTTCCTCT CTGAGCCTCT TOGCTGCCCT 480 

GGC30CTGGTC 0GCTTQG0C3G CTCAAGAIGA CTCCCTCTCT CAGATT6ATA AGTTGCTTCA 540 

TGTTAACACT GCCTCAG6AT ATGGAAACTC TTCTAATAGT CAGTCAGGGC TOCAGTCTCA 600 

ACTGAAAA6A g m iT A'C T G ATATAAATGC ATCCCACAAG GATTATGATC TCA6CATTGT 660 

GAATGGGCTT TTTGCTGAAA AAGTGTATGG CTTTCATAAG GACTACATTS AGTGTGCOGA 720 

AAAATTATAC GATGCCAAAG TGGAGCGAGT TGACTTTACG AATCATTTAG A AGACACTAG 780 

ACGTAATATT AATAAGTOGG TTGAAAATGA AACACATGGC AAAA TCRAGA Aa3TGATTGG 840 

TGAA GG TOGC ATAAGCTCAT CTGCTGTAAT GGT G CT GG T G AAT6CTGTGT ACTTCAAAOG 900 

CAAOTGGCAA TCAQCCTTCA CCAAGAGOGA AACCATAAAT TGCCATTTCA AATCTCCCAA 960 

GTGCTCTOGO AAGGCAGTCG CCATGATGCA TCAGGAAGGG AAGTTCAATT TGTCTGTTAT 1020 

TGAGGAOCCA TCAATGAAGA TTCTTGAGCT CAQATACAAT GGTGG CATA A ACATGTACGT 1080 

TCTGCTGCXrr GAGAATGACC TCTCTGAAAT TQAAAACAAA CTGAC CTTTC AGAATCTAAT 1140 

GGAATGGACC AATCCAAGGC GAATGACCTC TAAOTATGTT GAGGTATTTT TTCCTCAGTT 1200 

CAAGATAGAO AA6AATTAT6 AAATGAAACA ATATTTGAGA GCCCTAGGGC TGAAAGATAT 1260 

CTTTGATQAA TCCAAAGCAG ATCTCTCTGG GATTGCTTCG GGGGGTOGTC TGTATATATC 1320 

AAGGATGATG CACAAATCTT ACATAGAGGT CACTGAGGAG GGCACOQAGO CTACTGCTGC 1380 

CACAGGAAGT AATATTGTAG AAAAGCAACT CCCTCAGTCC ACGCTGTTTA GAGCTG ACCA 1440 

CXXATTCCTA TTTGTTATCA GGAAGGAT6A CATCATCTTA TTCAGTGGCA AAGTTTCrXG 1500 

CCCTTGAAAA TCCAATTGGT TTCTGTTATA GCA6TCCCCA CAAOITCAAA GRACCACCAC 1560 

AAGTCAATAG ATYTGRGTTT AATTGGAAAA ATGTGGTGTT TCCTTTGAGT TTATTTCTTC 1620 

CTAACATTGG TCAGCAGATG ACACTGGTGA CTTGACCCTT CCTAGACACC TGGTTGATTG 1680 

TCCTGATCCC TGCTCTTAGC ATTCTACCAC CATGTGTCTC ACCCATTTCT AATTTCATTG 1740 

TCrrrCTTCC CACGCTCATT TCTATCATTC TCCCCCATGA CCOGTCTGGA AATTATGGAO 1800 

RGTGCTCAAC TGGTAAGGAG AAOGTAGAAG TAGCCCTAGG GATCCTTTTT GAAACTCTAC 1860 

AGTTATOSCA GATATTCTAO CTTCATTGTA A6CAATCTAG 6AAATAAGCC CTGCTGCTTT 1920 

CTAGAAATAA CTGTQAAGGA TAAATTTTCT TTGTT6ACCT ATGAAGATTT TAGAGTTTAC 1980 

CTTCATATGT TTGATTTTAA ATCAGTGTAT AATCTAGATG GTAAAAAATG TGAAATTGGC 2040 

ATTAGGGACC TACCAAAATA TTTCATTAAT GCTTTCAATT GACAAATTTT GGCCTTTCTT 2100 

TGATAAGACA ATATGTACAT GTTTTTTCAA ATATTAAAGA TCTTlTAACr GTTGGCAGTT 2160 

GTTATCTACA GAATCATATT TCATAT6CT0 TGTA6TTTAT AAGrPTTTOC TCTATTTATC 2220 
AGAATAAA6A AATACAACAT ACCTGTAAA 

Seq ID NO: 263 Protein sequence:
Protein Accession #: NP_003775



1 11 21 31 41 51 

I I I I I I

KASLAAANAB FCFNLFREMD DNQGNGNVPF SSLSIiFAALA LVKLGAQDDS LSQIDKUiHV 60 

NTASGYGIfSS NSQSQLQSQL KRVFSDXNAS HXDYDIiSZVK GLFAEKVYGF HKDYZECAER 120 

LYZ3AKVERVD FTOKLEDTRR NINKHVENET KGKXKNVXGB G6ZS8SAVHV LVMAVYFXGK 180 

WQSAPTXSBT IKCKFKSPKC SGKAVAMMHQ ERKFNLSVIE DPSMKILELR YNGGINMYVL 240 

LPENDliSBIB NKLTFQNLME WTNPRRMTSK YVBVFPPQFK IBKNYEMKQY LRALGLKDIP 300 

DESKADLSGI ASGGRLYISR KMHRSYIEVT EBGTEATAAT GSKIVEKQLP QSTLFRADHP 360 
FLFVIRKDDI ILFSGKVSCP 

Seq ID NO: 264 DNA sequence 
Nucleic Acid Accession #: AB052906
Coding sequence: 74-814 

1 11 21 31 41 51 

I I I I I I

AAAACCTTGA GGTGATTCAT CTTCCAGGCT CTCCTTCCAT CSiAGTCTCTC CTCCCTAGCG 60

CTCIGGGTOC TTAATGGCAG CAGCCGCCGC TACCAAGATC CTTCT6TGCC TCCOGCTTCT 120 

G CTOCT OC TG TCOG G CTGGT CCCGOGCTGO 60GAGC0GAC CCTCACTCTC TTTGCTATGA 180

CATCAOOGTC ATCXXn*AAGT TCAGACCTOO ACCAOGOTOO TOTGCGGTTC AAGOCCAGGT 240 

GGATGAAAAG ACTTTTCTTC ACTATGACTG TGGCAACAAG ACAGTCACAC CTOTCAGTCC 300 

CCTGGGGAAG AAACTAAATG TCACAACGGC CTGGAAAGCA CAGAACCCAG TACTGAGAGA 360 

GGT0GTG6AC ATACTTACA6 AGCAACTGCX3 T6ACATTCAG CZGQAGAA1T ACACACCCAA 420 

GOAACOOCTC ACOCTGCAOG CCAGGAT6TC TT6TGA6CAG AAAGCTQAAO GACACAGCA6 480 

TGGATCTTGG CAGTTCA6TT TOGATGGGCa GATCTTCCTC CTCTTTGACT C3U3AGAAGAG 540 

AATGTGGACA AOSGTTCATC CTGGAGCCAG AAAGATGAAA 6AAAAGTGGG AGAATGACAA 600 

GGTTGTGGCX: ATGTCCTTCC ATTACTTCTC AATGGGAGAC TGTATAGGAT GGCTTGAGGA 660 

CTTCTTGATO GGCATGGACA GCACCCTGGA GCCAAGTGCA 6GAGCACCAC T08CCATGTC 720 

CTCAGGCACA ACGC3kACTCA 6GGCCACAGC CACCACCCTC ATCCTTTGCT GOCTOCTCAT 780

CATCCTCCCC TGCTTCATOC TCCCTGGC31T CTGAGGAGAO TCCTTTAGAG TGACAGGTTA 840

AAGCTGATAC CAAAAGGCTC CTGTGAGCAC GGTCTTGATC AAACTCGCCC TTCTGTCTGG 900 

CCAGCTGCCC ACQACCTAOG GTGTATGTCC AGTGGCCTCC AGCAGATCAT GATGACATCA 960 

TG6A0CCAAT AGCTCATTGA CT6CCTTGAT TCCTTTTGOC AACAATTTTA CCAGCAGTTA 1020 

TACCIAACAT ATTATGCAAT mVi ' ClltaU TGCTAOCIOA T6GAATTCCT GCACTTAAAO 1080 

TTCTGGCTGA CTAAACAAGA TATATCATTT TCTTTCTTCT CTTTTTGTTT GGAAAATCAA 1140 

GTACTTCTTT GAATGATGAT CTCTTTCTTa CAAATGATAT TGTCAGTAAA ATAATCAOST 1200 

TAGACTTCAG ACCTCTGGGG ATTCPTTCCG TGTCCTGAAA GAGA ATTTTT AAATTATTTA 1260 

ATAAGAAAAA ATTTATATTA ATGATTGTTT CCTTTAGTAA TTTATTGTTC TGTACTGATA 1320 
TTTAAATAAA GAGTTCTATr TCOCAAAAAA AAAAAAAAAA A 

Seq ID NO: 265 Protein sequence:
Protein Accession #: BAB61048.1




1 11 21 31 41 51 

I I I I I I 

MAAAAATKIL LCLPLLLLLS GWSRAGRADP ESLCYDITVI PKFRPGPHKC AVQGQVDEKT 60 
PLHYDOaJKT VTPVSPLGKK LNVTTAKKAQ KPVIiRBWDI LTEQIJlDIQIi ENYTPKEPIiT 120 
LQARMSCEQR AEC^SGSHQ FSFDGQIFZiL FDSEXRKHTT VHPGARKMKE KWEKDKWAM 180
SFBYFSMGDC X6WLEDFLK0 MDSTLEPSAG APLAHSSGTT QIiBATATTLI IiCCUiIXZiPC 240 
FXLPGX 

Seq ID NO: 266 DNA sequence
Nucleic Acid Accession #: XM_084853.1
Coding sequence: 127-444

1 11 21 31 41 51 

I I I I I I 

ATTGATGATA TATTTAA06A AATCAAATTT GGTGAATATG TGGACACTGG AAAGCTAATC 60

6ACAAGATCA ACTTAOCAGA TTTOCTAAAA GTGTAOCTTA AGCACAAGCC ACCTTTTOQT 120 

AACAOCATGA GT6GCATCCA CAAGAGCTTT GACGTGCTGG 6TTATA0CAA CTCCAAAOOO 180 

AAAAAGGCCA TTOGAAGAGA GGACTTCCTG AGACTGCTOS TTACTAAAGG TGAGCATATG 240 

AOOGAGGAGG AGATC3TTGQA TTGCTTTGCT TCACTGTTTO GOCTGAATCC CGAGGGATGG 300 

AAATCCGAGC CTGCAACCTG CTCOGTCAAA GGTTCAGAAA TTTeOCTTGA AGAAGAACTT 360

CXAGACGAAA TCACTGCAGA AATATTCX^OS ACTGAAATTC TTGGCTTAAC CATT TCAGA A 420 

GATT0006CC AGQATGGTCA GT6AAGTTAC CAGGAAT6TT TAAAGCACAA AGGACTTTGO 480 

GTGTGTGTGC AT6C3VCATGT GTGTGTTTTC • CATGAGGCAC TQCTTTTTAT GCATTTCGCT 540 
OOCXXXTCrC ATCTTTAGAA CATTTAGACA TTAAAGCAAO TTTCTGGT6A GCAAT6 






Seq ID NO: 267 Protein sequence:
Protein Accession #: XP_084853.1



1 11 21 31 41 51

I I I I I I

f4S0IHRSFSV liGYlNSKGKK AIRREDFLRL LVTKGEE^f^E EEMLDCFASL PGLNPE6WKS 60 
EPATCSVK6S EICLEEELPO EXTAEIPATB ILGIiTISEDS GQDOQ 

Seq ID NO: 268 DNA sequence

Nucleic Acid Accession #: NM_001898
Coding sequence: 57-482

1 11 21 31 41 51 

I I I I I I

GGCTCTCACC CTCCTCTCCT GCAGCTCCAa CTTTGTGCTC TGCCTCTGAG GAQACCATGG 60 

CCCAGTATCT GAGTACCCTG CTGCTCCTOC TGGCCACCCT AGCTGTGGCC CTGGCCTGGA 120 

GCOCCAAGGA GGAGGATAGG ATAATCOOGG GTGGCATCTA TAAGGCAGAC CTCAATQATG 180 

AGTXXSGTACA GOGTGCCCTT CACTTC60CA TCAG0GA6TA TAACAAGGCC AOCAAAGATG 240 

ACTACTACAG AOGTCCGCTG OGGGTACTAA GAGCCAGGCA ACAGACCGTT GGGGGGGTGA 300

ATTACTTCTT CGACGTAGAG GTGGGCCGCA CXIATATGTAC CAAGTCCCAG CCCAACTTGG 360 

ACACCTGTGC CTTCCATGAA CAQCCAGAAC TGCAGAAGAA ACAGTTGTGC TCTTTCGAQA 420 

TCTAGGAAOT TCCCTGG6AG AACAGAAGGT CCCTGGTGAA ATCCAG6TGT CAAGAATCCT 480 

AGGGATCTGT OCCAGGCCAT TG6CACCAGC CACCACOCAC TCCCACOCCC TGTAGTGCTC 540 

CCACCCCTGG ACTGGTGGCC CCCACCCTGC GGGAGGCCTC CCCATOTGCC TGCGCCAAGA 600

GACAGAOWSV GAAGGCTGCA GGAGTCCTTT GTTGCTCAGC AGGGCGCTCT GCCCTCCCTC 660 

CTTCCTTCTT GCTTCTAATA OCCCTQGTAC ATGGTACACA CCCCCCCACC TCCTGCAATT 720 
AAACAGTAGC ATOGCC 

Seq ID NO: 269 Protein sequence:
Protein Accession #: NP_001889.1

1 11 21 31 41 51 

MAQYLSTLLL LLATLAVALA HSPKBEDRXX PGGiyNADLH DEWVQRAIiHP AISSYNKATK 60
DDYYRRPLRV LRARQQTVGG VNYFFDVEVG RTICTK5QFN U>TCAFHBQP BLQKKQXiCSP 120 
BIYEVPHENR RSLVKSRGQB S 

Seq ID NO: 270 DNA sequence

Nucleic Acid Accession #: XM_093210
Coding sequence: 13-1854

1 11 21 31 41 51 

I I I I I I

ATGGCAAGOQ OOOGAATCTC CTCAGCTGCC GTTTCACAAA AGA6GTA0CA GGTCCSCACC 60

AAAOGAGCAC ACAAGCAGCA CCAGGAGCTG CAGAAGAAGG AGGOGGCAGC GATGGACCAO 120 

66CAGAGGGA AT6G6GAGGG GGCATCCTAC CCCATATCT6 AG6T60GACT GCGGGACSTA 180

6A60S6ACT6 GGCCTTTOCC GTTGGOOCGT 6GCCTCAATC AOQACTTCTT GCCCACG7GC 240 

GCCTTCAAAA CGGTAA6AGC TGCAACTGAA C6TGTGAGAC ATG6T6CAGA TAGGC7GAGA 300 

G6C660GGGA QAGATGCCCA TGAACTCAAG TACC0G6ACA G6CCCTCCAC TTCXACCACC 360

ACGAGTAACA CCQCCCCCAC G6GAC06CTC TCGAGGTCCC CCAAGCCAAG GACGCAAGGA 420 

GGAACGCCCC GGC60GCGGC CAGCAGCGGC GGGCAOOGGC CCAATGOGCA 06GAACTCAG 480 

CACTGGCAGT CGGCOCTCCT CACACOQCAG GCSIGCAfiTG TGGCC6ACG6 AGCCTOOGGO 540 

0 0C0GAG6ACC CA6CTA0GCC GTCACCCOQG T T QCTOOCAC GG6AAG6GGC ACCA6GCAAA 600 

CTGCCCAAGG CCCCGA6CCC AOOCTGCCTG GOQGAOGCCT COG C fGGTCC CGCOCAGATC 660

ATGGCCGCCA CCAGOCTOCC QA60CATGGC TTCCTG T CCG GGAAOGOOOC QOOSTCCTOO 720 
CTGTCCAGCT AG 



Seq ID NO: 271 Protein sequence:
Protein Accession #: XP_093210



1 11 21 31 41 51



289 
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WO 02/086443 
I I I I I I 

MLBRGBQKHK SASKKHDFIiP TCAFKTVBM TB HV K U OMM LSGOGSDABB LKyPDTPSTS 
TTTSNTAPTO PLSSSPKFST QGGTPSSRPA AAGTRAN6BQ TQHMQSALLT PQACSVADGA 
SRAEDPARPS PRUiPRGGAP GKLPKAPSPG SIAEASAGLL ARVRLQHAIA QRVSISQALP 
PHSSVCSKBB RPCMSOOItSA PAPMATELST GSRPSSHSBB AVNPTEPPGP RTQLBPSPSL 
tfRBBKPGKL PKAP8PGSIA BASACPAQIN AATBLPSRGP LSGHGPASHI. SS 

Seq 10 NOt 272 OMA sequence 

Nucleic Acid Accession it Eos eequesce 

Coding sequence: 1..732 



PCT/US02/12476 



1 

GGATACTGT6 
T6AAAAASCT 
TAATGTGGAG 
ATACCCACTT 
ATGATTTTGT 
TAAATTATTT 
TTAGTATCAC 
AAAATTGCAO 
TTAAGCC 



XI 

1 

TCACTCAAAG 
TTTTTTCCCA 
GAAATTATTC 
GAAGCCTCT6 
CTTGTTTCTG 
TTATTTATCT 
AATTTATGGO 
AAGTCATA06 



21 
I 

TAATGGGAGG 
CTTTTAACTT 
TTTCTCATTG 
TA6AAATGTC 
CAGTX3AGAAA 
TTCATATAQT 
AGAGGGTTTT 
ACT6TCATGT 



31 

I 

GAGAGAGAAC 
6CTTTAG0GT 
6AGATTACAG 

TOSTcxrrcoG 

TTACATCCAT 
TCTTACAArr 
TTGTATTTTT 
AnOCAGCTC 



41 

I 

AGGGAGGGTA 
TAAGAGTACT 
AATATA TCTA 
GTTGTATTTC 
AGCAAAGACA 
TCTAAAAAAT 
AA6CATATGT 
TGA6AACCAA 



Seq ZD NOt 273 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 

I i I I 

MGGRENRE6R DAFEKAFFFT FNLL 

Seq ID MOt 274 DNA sequence 

Nucleic Acid Accession «: NM_003976.2 

Coding sequence t 299-961 



41 

I 



51 



CTCTGAGCTT 
CATGGAGTTG 
CTACTTCTGC 
GGGTG6CAGQ 
CAGGAGGGTG 
GGAAC7TGGA 
TGCCCTGTGG 
GGGCTCCGOG 
OGCCGGCCAC 
OCOGC06CAG 
OSGGQGCOOC 
GG6CT6CCGC 
0GACGA6CTG 
CGACCTCAGC 
GCCOGTCAGC 
CAACAOCACC 
AG06CT0GCT 
CCTCCCGCAO 
AGGCCCCTAC 
CAGC CCCAGA 
GGAGCCCTTC 
COCTC CTCTQ 
ACAGCATTTO 
CCTGTACTCA 



11 
I 

CTCTGAGCCT 
TGAAAGAATA 
TGGGTTGA6T 
COGGTCCCCC 
G6GGAACAGC 
CTTGGAGGCC 
CCCACCCTGG 
CCCCGCAGCC 
CTGCOGGOGG 
CCTTCTOGGC 
GCGGC60SG0 
CTGCGCTOGC 
GTGCGTTTTC 
CTGGCCAGCC 
CAGCCCTGCT 
TGQAGAAOOO 
CCAGG6CTTT 
AGTCCCACTA 
CGGTGGGTGA 
GCCCTCACCC 
QQAGCCACTT 
ATGAACACTA 
AAGGACACAT 
CTCATGGGAG 



21 

I 

TGTTTGCTCA 
GCTGCAAAGC 
CTAGCrGTOT 
ACAAAAGATA 
TCAACAATGG 
TCTCCACGCT 
CCGCTCTGGC 
CTGCCCCCCG 
6ACGCACGGC 



CTGGGOGCCC 
AGCTGGTGCC 
GCTTCTX3CAG 
TACTGGGCGC 
GCCGACCCAC 
TGGAOOGCCT 
GCAGACT06A 
GCCAGCG6CC 
TGGATATCAT 
TGC6GATCCC 
CTCACAGACT 
CAOTOGCTGA 
ATT6CAGTTG 
CT06CC0C 



31 

I 

TCTGGAAAAA 
ACCTAACACA 
AG3CCCCTTG 
ACTCATCTCT 
CTGATGGGCG 
GTCCCACTGC 
TCTGCTGAGC 
OSAAGGCCCC 
C06CTGGTGC 
GC06CCTGCA 
6G6CAGC06C 
(36TG0Q0G0G 
0G6CTCCTGC 
CGGGGCCCTQ 
6GQCTACXSAA 
CTC060CACC 
CX3CTTAOOC56 
TCAGCCAGGG 
CCCOGAACAG 
AQCCTA AAAG 
CTOOCACTGG 
GGCATCAO CC 
CTTC36TT6AA 



41 

I 

GGGGATTAAA 
TAGTAAGGTT 
TTCCTCAOCT 
TAATTTGCAA 
CTCCTGGTGT 
CCCTGGCCTA 
A60GT06CAG 
COGCCTGTCC 
AGTGGAAGAO 
CCOCCATCTG 
GCTO GGGCAG 
CrOGGCCTGO 
0GG0GG60GC 
CQACC6CCCC 
GOGGTCTCCT 
GCCIGOQGCT 
T66CTCTTCC 
ACGAA6GCCT 
GTGAAG6GAC 
ACACCAOAGA 
CCAGGOCTOQ 
CCCGCCCAOG 
ASTGCCrOTG 



51 
1 

CCATTTACCT 
CXXZAGTGCAQ 
6QAGAAACTG 
GCTGCCTCAA 
TGATAGAGAT 
GGCGGCAGCC 
AGGCCTCCCT 
TGGCGTCCCC 
CCCGGCX5GCC 
CTCTTCCCOG 
CGGGGGCGCX3 
GCCACCGCTC 
GCTCTCCACA 
OGGGCTGOOG 
TCATGGA06T 
GCCTGGGCTS 
T6CCTG66AC 
CAAAGCTGAO 
AACTGACTAG 
CCTCAGCTAT 
AACCTG6GAC 
C0CTGTAGG6 
CTGGAACTGG 



Seq ZD NO I 275 Protein sequence! 
Protein Accession It NP_003967.1 

1 11 21 31 41 51 

i I i I I I 

MSLGLOQLST LSBCFHPRRQ PALWPTIAAL ALLSSVAEAS liGSAPRSPAP REGPPPVLAS 

PAGBZiPQGRT ARHC8GRARR PPPQPSRPAP PPPAPPSALP SG6BAARAGG FGSSASAA6A 

RGCRIiRSQLV FVSAX.6I1CTR SDBLVRFRFC SGSCRRARSP BDLSLASZ1L6 AGALRPPPGS 
RPVSQPCCRP TRYEAVSFMD VN8TWRTVDR LSATA08CU3 

Seq ZD NOi 276 DMA sequence 

Nucleic Acid Accession «: NM.057091.1 

Coding sequence: 763-1445 ~ 



ACTGGCOGCT 
GGACCCCCAA 
TOGCTCCCCG 
OGOGTGl'CXA 
CTCCATATOC 
CAAOCXAGGG 
06GG6CAGG6 
CACOGGACGG 
CAGACAAGGC 



11 
I 

GAGAGAA6AA 
ATCTGCACGT 
CCCTCACTCA 
CAAACTCAAC 
GAOOGQCCOC 
GGQAC1X3GAT 
Q06CTC0CA0 
CT60GGC6GC 
COGGGGGCTC 



21 
I 

TCOGGTGGAa 
ACCAGCAGTC 
CTTTCTCCOG 
TCC06GTTTC 
TCCCAGCATC 
OGQAOGGGTO 
0CCCAOCC06 
GGGCAGGAGG 
G6CCA6CAGC 



31 
1 

CAGAG AGCRG 
AGC0C3C0CCA 
CCCTCGGOOC 
06TG0CTCTC 
TACOOOOCTC 
GA6CAGGCA6 
GGATCTGGTG 
CTGCTGAGGG 
A6GTG0CTO6 



41 

I 

CTGCT6CAG6 
06CAGG6ACC 
GGCCTCCCAO 
CAO03CTO6A 
CCMCCtOGSS 
GTGA600C0S 
ACGCTGGG6C 
ATCKsAGTTGG 
GGCCCCAGCC 



51 
I 

GCAGACA6CC 
GGCTTACCCC 
CTCTCTACTT 
GTTCTCTACT 
GOGACCTASC 
AAAGGTGGGO 
TGGAATTTGA 
GCCCGGCCCC 
CTOGCTGCCA 



60 
120 
180 
240 



51 

I 

GGGATGCTTT 
TACCAGCTAA 
TTCATCTTGA 
TAAAACCTAC 
AAAGTCTTTT 
TAACACTCAT 
6GCTTATATA 
T6GCTGAAAC 



60 
120 

leo 

240 
300 
360 
420 
480 



60 
120 
160 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 



60 
120 
180 



60 
120 
180 
240 
300 
360 
420 
480 
540 



290 



wo 02/086443 

CCC0G6CCTG GAGCCCChOl OCOGAfiGGTG CAGACTGGCT G COAG GCXa CACTTTTG6C 600 

TAARAGAGGC ACTGCCA6GT GTACAGTCCT GGGCATGOGC TGTTTGftGCT TOGGGGGAGA 660 

GCCCAGCACT GGTCCCCGGA AAGGTGCCTA GAAGAACAAG GTGCAKSACC CC3GTGCTGCC 720 

TCAACAGGAG GGTGGGGGAA CAGCTCAACA ATGGCTGATG GGCGCTCCPG GTGTTGATAG 780 

AGATGGAACT TGGACTTGGA GGCCTCTCCA OGCTGTOOCA CTGCCCCTGG CCTAG6CG6C 840 

AGCCTGCCCT GTG6CCCACC CTGGC06CTC TGGCTCTOCT GAGCAGOGTC 6CA6AG6GCT 900 

CCCTGGGCTC CGCGCCCCGC AGCCCT6CCC CCCGCGAWSG CCCCCOGCCT GTCCTGGCGT 960 

CCCCCGCC3GG CCACCTGCOG GGGGGAOGCA CX^GCOOGCTG GTGCAGTGGA AGAGCCOGGC 1020 

GGCCGCCGCX: GCAGCCTTCT CX3GCCOGOGC CCCOGCOGCC TGCACCCCCA TCTGCTCTTC 1080 

CCX3G0GGGGG CX3GOG0GGO6 CGGGCTGGGG GC0C3GGGCAG CCGCGCTOGG GCAGOGGGGG 1140 

CGCGGGGCIG COGOCtG O G C TG6CA6CTGG TG0CX3GTGCS GGOQCTGGGC CTGGQCCACC 1200 

GCTCCGAOGA 6CTGGTG0GT TTCCGCTTCT GCAOC3G6CTC CTGC06C0GC GC60GCTCTC 1260 

CACACGAOCT CAGCCTGGCC AGCCTACTGG GCGCX»3GGC CCTGCGAOOG CCCCCGGGCT 1320 

CCCGGCCGGT CAGCCAGOOC TGCTGCCGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG 1380 

AOSTCAACAO CACCTGGAGA ACCGTGGACC GCXTTCTCCGC CACCGOCTGC GGCTGCCTGQ 1440 

GCTGAGGGCT CGCTCCAGGG CTTTGCAGAC TGGAOCCTTA COGGTOGCTC TTCCTGOCTO 1500 

GGACCCTCCX: GCAGAGTCOC ACTAGCCAGC GGCCTCAGCC AGGGAOGAAG GCCTCAAAGC 1560 

TGAGA66CCC CTACCGGTGG GTGATGGATA TCATCCCOGA ACAGGTGAAG GGACAACTGA 1620 

CTAGCAGCCC CAGAGCCCTC ACCCTGC6GA TCCCAGCCTA AAAGACACCA GAGACCTCAG 1680 

CTATGGAGCC CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTCGA ACCT G 1740 

GGACCCCTCC TCTGATOAAC ACTACAGTGG CTGAGGCATC AGGOCOCX^CC CAGGCCCTOT 1800 

AGGGACAGCA TTTGAA6GAC ACATATTGCA GTT6CTTGGT TGAAAGTGCC TGTGCTGGAA 1860 
CTQGCCXGTA CTCACTCAT6 G6AGCTGGGC CC 

Seq ID NO: 277 Protein sequence:
Protein Accession #: NP_003967.1

1 11 21 31 41 51
| | | | | |
MEU3L06LST LSRCPWPRRQ PAUHPTLAAL ALLSSVAEAS LGSAPRSPAP RBGPPPVLAS 60
PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RG6RAARAGG PGSRARAAGA 120
RGCRLRSQLV PVRALGLGHR SDELVRPRPC SGSCRRARSP BDIiSLASLLG AGALRPPP6S 180
RPVSQPCXS^ TRYEAVSFMD VNSTWRTVDR LSATACGCL6

Seq ID NO: 278 DNA sequence
Nucleic Acid Accession #: NM_057160.1
Coding sequence: 1-714

1 11 21 31 41 51
| | | | | |
ATGCCC3GGCC TGATCTCAGC CCGAGGACAG CCCCTCCTTO AGGTCCTTCC TCCCCAAGCX: 60
CACCTGGGTG OCCTCTTTCT CCCTGAGGCT CCACTTGGTC TCTCCGOGCA GCCTGCCCTG 120
TG6CCCACCC TGGCCGCTCT GGCTCTGCTG AGCA60GTCX5 CAGAGGCCTC CCTGGGCTCC 180
GCQCC0C6CA QOOCTQCCCC CCX;OGAAGGC CCCCOGOCTS TCCTG60GTC CCCC6CCGGC 240
CACCTGC060 GGGGA06CAC 66CCCGCTGG T6CA6TG8AA QAOCCOGGOG GCCGCOGCCG 300
CAGCCTTCTC GGCCOGCGCC CCCGCOGCCT GCACCCCCAT CTGCTCTTCC CXS3CX3GGGGC 360
CGCGCGGCGC GGGCTGGGGG CCCGGGCAGC CGCGCTOSGG CAGCGGGGGC GCGGGGCTGC 420
CGCCTGCGCT aSCAGCTGGT OCOGGTGCGC GCGCTCGGCC TGGGCCACCJG CTCOGAOGAG 480
CIXnSTGGGTT T006CTTCTG CAGCGGCTCC TGC0SC06CG GGOGCTCTCC ACAG8ACCTC 540
AGCCTGGOCA GCCTACTG6G C96CC6G6GCC CTG05AC06C CCC0BG6CTC CCOQOCOGTC 600
AGCCAGCCCT GCTGCCGACC CACGCGCTAC GAAGCGGTCT CCTTCATG6A CGTCAACAGC 660
ACCTGGAGAA CC3GTGGACCG CXTTCTCCGCC ACOGCCTGOQ GCTGCCTGGG CTGAGGGCTC 720
6CTCCAGG6C TTTGCAGACT GGACCCTTAC OGGTGGCTCT TCCTGCCTGG GAOCCTCCOQ 780
CAGAGTCCCA CTAGCCAGOQ GCCTOVGCCA GG6AC6AA0G CCTCAAA6CT GAOaGQCCCC 840
TACOGGTGGQ TOATGGATAT CATCCCOSAA CAGGTGAA06 GACAACTGAC TAGCAGCCCC 900
AGAGCCCTCA CXXTTGCGGAT CCCAGCCTAA AAGACACCAG AiGACCTCAGC TATGGAGCCC 960
TTCGGACCCA CTTCTCACAG ACTCTGGCAC TGGCCAGGCC TCGAACXTO5 GAOCCCTCCT 1020
CTGATGAACA CTACAGTGGC TGAGGCATCA GCCCCCGCCC AGGCCCTGTA GGGACAGCAT 1080
TTGAA06ACA CATATTGCAQ ri XjC Tl Wri' OAAAGTGCCT GIGCTGGAAC TGGCCTGXAC 1140
TCACTCA1G0 QAGCTGaCCC C

Seq ID NO: 279 Protein sequence:
Protein Accession #: NP_476501.1

1 11 21 31 41 51
| | | | | |
MPGIiISARGQ PLLBVLPPQA HLGALFLPEA PLGLSAQPAL WPTLAALALL SSVAEASLGS 60
APRSPAPRB6 PPPVLASPAG HLPGGRTARW CSGRARRPPP QPSRPAPPPP APPSALPRG6 120
RAASAG6PG8 RARAAQARGC RLRSQLVPVR ALGI43BRSDB LVRFRFCSGS OtRARSPflDL 180
SIASLLGAGA LRPPPGSRPV SQPCCRPTRY BAVSFMDVHS TWRTVDRLSA TACGCU3

Seq ID NO: 280 DNA sequence
Nucleic Acid Accession #: NM_057090.1
Coding sequence: 29-715

1 11 21 31 41 51
| | | | | |
CTGATGGGOG CTCCTGGTGT TGATAGAGAT GGAACTTGGA CTTGGAGGCC TCTCCAC6CT 60
GTCCCACrOC C0CIG60CTA GGGQGCAQQC TOCACTTQGT CTCTOOGOQC AGCCTGCCCT 120
GTGGCCCACC CTGGCCGCTC TGGCTCTSCT GftGCAGCGTC GCAGAGGCCT CCCTGGGCTC 180
CGOOCCCOGC AGCCCTGCCC CCOGCGAAGG CCCCCOGCCT GTCCTGGCGT CCCCCGCOQG 240
CCACCTGCCG GGGGQACGCA OGGCCCGCTG GTGCAGTGGA AGAGCCCGGC GGCCGCCGCC 300
GCAGCCTTCT CGGCCCGOGC CCCCGCCGCC TGCACCCCCA TCTGCTCTTC CCC60GGGQG 360
C0GCGCGGC6 CGGGCTGGGG GCCCGGGCAO CCGCGCTOGG GCA6G6GGG6 OGGGGGGCTG 420
C06CCTGCGC TCGCAGCTGG TGCCGGTGCQ 0GCGCTC66C CTGG6CCACC GCTCCGAOGA 480
GCTGGTGOGT TTCCGCTTCT GCAGCGOCTC CT6CCGCCGC GCGCGCTCTC CACACGACCT 540
CAGCCTGGCC AOOCTACTGG GOGCOGGGGC CCTGOGAOOS CCCCCGGGCT CCOSGOOOGT 600
CAGCCASCOC TGCTGCCGAC CCA060GCTA CGAAGCGGTC TCCTTCATGG A0STCAACA6 660
CACCTGGAGA ACOSTGGACC 6CCTCTCCGC CACXS3CCTGC G6CTGCCTCG GCTGAGGGCT 720 

C3GCTCCAGGG CTTTGCAGAC TGGACCCTTA COSGTGGCTC rrCCrGCCrG GGACCCTCCC 7B0 

GCAOAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGAOGAAG GCCTCAAA6C TGAGAOGCCC B40 

CTA00QGIG6 GXGAIGGAXA TCATCOCCGfc AOUSGTGAAG GGAChACTG^ CTAGGAGOGC 9Q0 

CAGAGCOCTC A0CCT6C0GA TOOCAGCCTA AAAGACACX^A GAGACCTCAO CTATG6AGCC 960 

CTTCGGACCC ACTTCTCACA GACTCTGGCA CTG6CCAGGC CTCGAACCTO GGACCCCTCC 1020 

TCTCATGAAC ACTACAGTGG CTGAGGCATC AGCCCCOGCC CAGGCCCTGT AGGGACASCA 1080 

TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGGAA CTGGCCTGTA 1140 
CTCACTGATO OGAOCTGGCC CC 

Seq ID NO: 281 Protein sequence:
Protein Accession #: NP_476431.1

1 11 21 31 41 51 

I 1 I I I I 

MELGLG6LST LSBCPVIPRRQ APliGLSAQPA LHPTLAALAL LSSVAEASLG SAPKSPAPKB 60 

GPPPVLASPA GHLPGGRTAR WCSGRARRPP PQPS31PAPPP PAPPSALPRO GRAARAGGPG 120 

SRARAAGft&G CRLRSQLVPV BAIiGIXSBRSD ELVRFRFCSG SCRRARSPHD LSLASX^IXSAG 180 
AIiRPPPGSRP VSQPCCRPTR YEAVSFMDVtf STHRTVORLS ATA0GCU3 

Seq ID NO: 282 DNA sequence

Nucleic Acid Accession #: Eos sequence

1 11 21 31 41 51 

I i I I I I 

CTACTGCACC TOCCCTCTOT TTCCTTTGGA AATCTCTTAC CTTTCATTAG GGTTTCTTTC 60 

ATAGCAATTT CCTTTGGTTT TTAAGACTTC TACATTGCTT TTTCTTTTAT TATCTGTOCT 120 

C0C3TGAACCT TATGAATGCT GCTTAAAAAT AATOTCAAAA TATOTTTTAG CTGOCTACTC 180 

AGGTAACGTT TTCTTTTGCT CTCATCTTGO TTTCCATATA CTATTTTTGG TTTTTTGTGA 240 

GATCTAATCA ATGATCTAGT CAGAAGCTAC TTCACTGGCT AACAGTGATC ATGTTCATGT 300 

6CTAAAAAT0 AACTT6AAAC ACGGAAGTA6 TG6TT6GTCC AGTTTGAAAO CTCTTATTAG 360 

TATTCTTCAT CCTGOCTOTA ATAATAOCCA TTATTTGTTA TQOCTTTGTT ATCSTAGCAGA 420 

CACTCTTAA© GATTTTATGT GTATTATTCA AATTGCIATT ACTGTTCTTT TTATAGTTGft 480 
GAATCTCAGG ATACXHTACAT TTATCACTTT TTCAATATAT AIGTATTTCT TATT 

Seq ID NO: 283 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 564-1481 

1 11 21 31 41 51 

I I I I I i 

GASACTTTTA ATCATCTATC C C nXJ i XX r iT TAOSCAfiACC CTACAATACA CTAGAGGCTT 60 

CAAAGAtSGTC AAAAATTCAC ATGTGTAGAC AAATTAGGTC CCTTAAGATG CCAQGCAAAC 120 

GAAGTGCTAC CAAAACACGC AATGACTGTC CTAAAAGTGC GTTCTQGGAT ACACCTGTAA 180 

ACTTGGATCA AGTTCCCTCC CCTCTCCTCA AAATATATOS ACTTGTGCTG AAAC3AAATCA 240 

C38AOGCSATGC TCACAATTCT GACCTOSTAA TTAXATAOGa QGTGGTTTT6 6TTTCTG0GT 300 

CTTTCCCTGA TTCAGTGGCA GGTAACATAT TTCATGTACA AAATGAACTG CAACACCA06 360 

GCAAACAA6G GACAGGCCCT CAAAGTTGTC GGTAGGGAGC CAGGACCCCG CCPJ3TGGCGT 420 

GGGGAGACAC CGTACTAAAC AAGCTTGCAA ACAGCAGGC3V CCTTCCTGCC ACTGAGGAGG 480 

AAGGGCTG6C TAAGGGAGGC OGGGGOSGAQ GAAG0CAA8C TCTGCAGGCC CTGACAAAGT 540 

CCTCCGG6CC TCCA06CGTC GCCATQ6CAA 060QGGOTCT OTQCr GG CCQ GGATTGGCCG 600 

GCCT66CGCG 06CAGGGCCC 6CT6GGAAAG CX3CGTCCC06 COGOSGCTCC GCCAGTTTGA 660 

ACTTG6CG6G CCAGATGTGG GCGGCGGGGC GCT6GGGGCC TACTTrTCCC TCTTCCTACX3 720 

CCX3GTTTCTC TGCTGACTGC AGACCCASGT CTCXK5CCCTC CTOGGACTCC TGCTCAGTCC 780 

CTATGACG66 0GCACXSTG6G CAG6GGCTG6 AGGT6GT6C6 CTCGCCGTCXS COGCCSGCTGC 840 

CS3CTGAGCIG CAGCAATTGC ACCAG6TGGC TGTTGTCTCC OCTTQOCCAC CA6ASCTT0C 900 

AGTTTGACGA GGACGAGGGT GAOOGGGAOO ATGA6GAAGA 08T66ATGAT GA6GAAGACG 960 

TGGATGAAGA TGCCCATGAT TCAGA66CCA AAGTGGCGAO CXWSAGAGGA ATGGAGTTAC 1020 

AGG0GTGC6C CAGCACTCA6 GTTGAATCAG AAAATAACCA AGAAGAACAG AAACA66TGC 1080 

GCTTACCAGA AAGC06CCTG ACACCATG6G AGGT6TG6TT TATT6GCAAA GAAAAA6AAO 1140 

AAGGTGACCG GCTGCAACTG AAAGCTCTAG AOGAATTAAA TCAACAACTA 6AAAAAAGAA 1200 

AAGAAATG6A AGAAOSTGAA AAAAGAAAGA TAATTGCT6A AGAAAASCAC AAGGAATGGG 1260 

TTCAGAAAAA GAATGAGCAA AAAAQAAAAO AAAGAGAACA AAAAATTAAT AAAGAAATGO 1320 

AQGAAAAAGC AGCAAAGGAA CT6GAGAAAG AATACTTGCA AGAAAAAGCA AAAGAAAAAT 1380 

ATCAAGAATG GTTAAAGAAA AAAAAT6CTG AAGAATGTGA GAG6AA6AAG AAAGAAAA6A 1440 

AAAACAACAG CAAGCTGAAA TACA6GAGAA AAAGQAAATA GCAOAAAAAA AGTTTCAAGA 1500 

ATGGTTGGAA AATG06AAAC ATAAACCTOO TCCAGCT6CA AAGAGCTATG GTTATGCCAA 1560 

TGGAAAACTT ACAGQTTTTT ACAGTGGAAA TTCCTATOCA GAACCAGCCT TTTATAATCC 1620 

AATTOCGTGO AAACCAATTC ATATGCCACC TCCCAAAGAA GCTAAGGATC TATCAGGAAG 1680 

GAAGAGTAAA AGACCTGTGA TAAGTCAGCC ACACAAGTCA TCATCTCTGG TAATTCATAA 1740 

A6CCAGGAGC AATCTTTGCC TTGGAACTCT GTGCAOAATA CAAAGATA6C 6TATGTGGAA 1800 

AATAACATGC TTTTATCTGG AGCTATTTAA TTTAAAAATC AGAAATTGT7 TTTTACTGCT 1660 

CAGTCAATAA CTCAACACTT AATGTGATTA TTGACAAATA GCAATTTTTG CATTTGTATA 1920 

TGGAGTCCTT AGAGTTGAGO AAGATATTTT CT6GATTTT6 6TTTTTATAA ACTTTTTAAG 1980 

GTTGATCTTG GCATGTTQTT TTGCAGAATA AGTGGCT6AA TAT6TAA6AA TTGTGTTTCT 2040 
ATTTASCTT6 TATTAAAAGT ACACTGTAAT ACCAATAAAA CTAACAATTT TTCTT6 

Seq ID NO: 284 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 31 41 51 

I 1 I I I I 

MATRGIOfPG LA6LARAGPA GKARPSRGSA SLNLAGQMMA AGRWGPTFPS SYA6FSADC31 60 

PRSRPSSDSC SVPMT6ARG0 OIiEWRSPSP PLPLSCSNST SSLLSPL6BQ SFQFDEDDCa) 120 

GBDEEDVDOE E0VDEDABDS EAKVASLROl ELQGCASTQrV ESENNQBEQK QVRLPESRLT 180 

PWEVWFIGKB KEERDRLQLR ALEELNQQLE KRKSfEEREK EKIIAEEKHK EHVQKKNEQK 240 

RKEREQKINK EMEEKAAKEL EKEYLQEKAK BKYQEWLKKK HABECERKKR EKKNHSKLKY 300 

Seq ID NO: 285 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1-1746

1 11 21 31 41 51 

I t i I I I 

ATGCCACTGA AGCATTATCT CCTTTTCXTrG GTGGGCTGCC AAGCCTGGGG TGCAGGGTTG 60 

GOCTACCATG GCTOCCCTAG CCSAOTGTACC TGCTCCACGG CCTCCCAGGT GGASTGCAOC 120 

GGQGCACGCA ■IWltWOGGT GCCCAOCCCT CTG O CCTGGA ACQCCAT6AG GCTGCAGATC 180 

CTCAACA06C ACATCACTGA ACTCAATGAG TCCOCGTTCC TCAATATCTC AGCCCTCATC 240 

GCCCTGAGGA TTGAGAAGAA TGAGCPGTCG CGCATCAOGC CTGGGGCCTT CCGAAAOCTG 300 

GGCTCGCTGC GCTATCTCAG CCTCGCCAAC AACAAGCTGC AGGTTCTGCC CATCX3GCCTC 360 

TTCCAGGGCC TGGACAGCCT TCSAGTCTCTC CTTCT6TCCA GTAACCAQCT GTT6CAGATC 420 

CAGCXX36COC ACTTCTCCCA GTGCAGCAAC CTC3kAGGAGC TGCAGTTGCA C9GGCAA0CAC 480 

CTGGAATACA TGCCTGAOG6 AGCCTTOGAC CAOCTGGTAG GACTCAGGAA GCTCAATCTG 540 

GGCAAGAATA GCCTCACCCA CATCTCACCC AGGGTCTTCC AGCACCTGGG CAATCTCCAG 600 

GTCCTCOGOC TGTATGAGAA CAGGCTCAOS GATATCCCCA TGGGCACTTT TGATGGGCTT 660 

GTTAACCTGC AGGAACTGGC TCTACAGCAO AACCAGATT6 GACTGCTCTC CCCTGGTCTC 720 

TTCCACAACA ACCACAACCT CCAGAGACTC TACCTGTCCA ACAACCACAT CTCOCAOCTQ 780 

CCACCCAGCA TCTTCATGCA GCTGCCCCAG CTCAACOGTC TTACTCTCTT TGGGAATTCC 840 

CTGAAGGAGC TCTCTCTGGG GATCTTOGGG CCCATGCCCA ACCTGOGGGA GCTTT6GCTC 900 

TATGACAACC ACATCTCTTC TCTACCCGAC AATGTCTTCA GCAACCTCOG CCAGTTGCAG 960 

GTCCTGATTC TTAGCCGCAA TCAGATCAGC TTCATCTCCC CGG6TGCCTT CAACGGGCTA 1020 

AOGGAGCTTC GGGAGCTGTC CXTTCCACACC AACX5CACTGC AGGACCTGGA CGGGAATGTC 1080 

TTCCGCATGT TG6CC3tflCCT OCAGAACATC TCCCTGCAGA ACAATOGCCT CAGACAGCTC 1140 

CCA6G8AATA TCTTOGCCAA OGTCAATGGC CTCATGGCCA TCOVGCTGCA GAACAACCAG 1200 

CTGOAGAACT TGCCCCTCGG CATCTTCGAT CACCTGGGGA AACTGTGTGA GCTGCGGCTG 1260 

TATGACAATC CCTGGAGGTG TGACTCAGAC ATCCTTCCGC TCC3GCAACTG GCTCCTGCTC 1320 

AACCAGCCTA GGTTAGGGAC GGACACTGTA CCTOTGTGTT TCAGCCCAGC CAATGTCOGA 1380 

GGCCAGTCCC TCATTATCAT CAATGTCAAC 6TTCCTGTTC CAAGOGTCCA TGTCCCTGAG 1440 

GTGOCTAGTT ACCCASAAAC AOCATGGTAC OCAGACACAC OCAGTTACCC IGACACCACA 1500 

TCOGTCTCTT CTACCACTGA GCTAACCAOC CCTOTGGAAG ACTACACTGA TCTGACTACC 1560 

ATTCAGGTCA CTGATGACCG CAGCGTTTGG GGCATGACCC AGGCCCAGAQ OGQ6CTGGCC 1620 

ATTGCCGCCA TTGTAATTGG CATTGTCGCC CTGGCCTGCT CCCTGGCTGC CTGOGTCGGC 1680 

TGTT G CTGCT GCAAGAAGAG GA6CCAA6CT GTCCTGATGC AGATGAAGGC ACCCAATGAG 1740 

TGTTAAAGAG GCAGGCTGGA GCA60GCTG0 GGAATQATGO GACTGGAGGA CCTGGGAATT 1800 

TCATCTTTCr 0CCTCC31CCC CTCGGTC C AT 6GA0CTTTCC OGTGATTGCT CTTTCTGGCC 1860 

CTAGATAAAO GTX3TX«XTAC CTCTTCCTGA CTTOCCTGAT TCTCCCOTAG AGAAGCAGGT 1920 

CGTGCOGGAC CTTCCTACAA TC3VGGAAGAT AGATCCAACT GGCCATGGCA AAAGCCCTGG 1980 

GGATTTCOGA TTCATACCCC TGGGCTTCCT TOSAGAGGGC TCTTCXTTCCA AATCCTCCCC 2040 

ACCTGTOCTC CAA8AACAGC CTTCCCTGOG OCCAGGCCCC CTG0GG6CCT CTGTAGACTC 2100 

AGTTAQTCCA CAGCCTOCTC ACTTC6TOG0 AATAGTTCTC OGCTGAGATA GCCCCTCTCO 2160 

CCTAAGTATT ATGTAAGTTG ATTTCCCTTC TTTTGTTTCT CTTGTTTGTG CTATGGCTTG 2220 

ACCCA6CATQ TCCCCTCAAA TGAAAGTTCT OCCCTTGATT TTCTGCTCCT GAAGGCAGGG 2280 

TGAGTTCTCT CCTCAAAGAA GACTTCAAAC CATTTAACTG GTTTCTTAAG AGCCGTCAAT 2340 

CAGCCIGGTr TTQ6GGAT6C TATGAAAOAO AGAAfiGAAAA TCATGCCGCT CAGTTCCTG6 2400 

A6ACAGAAGA 60C3GTCATC!A OTGTCTCACT T6TGATTTTT ATCTGGAAAA GGAAGAAACA 2460 

CCCCAGCkCfL GCAAGCTCAG CCTTTTAGAG AAGGATATTT CCAAACTGCA AACTTTGCTT 2520 

TGAAAAGTTT AGCCCTTTAA GGAATGAAAT CATGTAGAAT TTTGGACTTC TAAAAACATT 2580 

AAAATCAGCT TATTAATA06 GGATAGAGAA AGAAATCTGG TGCCTGGGGG TCCCTGTGTT 2640 

CACCCCTAGA GTTTGTTTTA AAATTTTTAA TTGAAGCATG TGAAGT6TAC STGCAGAAAA 2700 

OTGOGAACAT 6ATAGTGTAT GQCTTQGTGG ATTTTCACAA ACTGAACATA CCTGTGTAAI 2760 

CAGCATCTAG ACCCAGACXX: AOAGCATCAC AAATATCCCC CATCCTGGGC TTTTCCCAGA 2820 

GGAGATGGGG GCTTCTGAAQ ATOGACTTAC CTGGGACCTG CCCCCCATGA GCCAGGACGQ 2880 

TCCCXXCACA GTCAGCCTGT GCAAAGGCCC CGTGGCCAGG GGTG6AGGAG AATAT6TGG6 2940 

TQTG6ACAGG AIOOGAGACT GTGGCCTGAA CA6GAGATTT TATTATATCT G6A6ACCCT6 3000 

AGAGACCCTG AGA0CT6GG6 CACCATOOCT COCC3U3GTCA 6AA6CATCCT GACTGCAGAO 3060 

GTCOGTGCAG CCACACCCTC TTOCCTGCCA GCAAGTTGTC TGCX3GCTCAT CGGAGGCCCX: 3120 

TCOGCCTGGA GCX:TTCTATQ GAOGTGATAT GCCTGTATCT GTTTTTAATT TTCATTCTTC 3180 

ACTTAGGOGA A6TGAAATCG CTCAGAGAT6 AGATCCTTTA ATTGAAAACG AAGTGTAACG 3240 

OAATCTAOrO TCTTTCTAAT GTGGTAAAAT TCTCCATCAA CATCACAOTC AGCTOGCAGC 3300 

TGAACTTCAG AATCTCACTT ACAGCAGGCG ACAOGGGGOT ACAOOGATGG GTCACACTG6 3360 

GTCTGGGGGC TCOCTCGAGC TCCTCCTQOO TGTGGTCTGG TTAGGAGTTG AGTTGTTTGC 3420 

TCCAGGGTTA TTCTCCTCCT CGAGTCACAO TCACAOGAAT ACCTGCCTTC TCTGGCTTTC 3480 

CTGCTATACA CATATTCACA TGGCGCTCAA GAAGTTAGGC TCATGGCAAC GTGTGTCTTT 3540 

CTCTGGACAA CTGQCCCAGT TTACAGTGAA ATG6AGAATT TGAGGTCTOC AOGTCTGCOC 3600 

AGGAAAGAAC TTCA6CTGAC TCCA06OG6A TCTGGAAATC CAOGACCAAT 0C0GAT06GC 3660 

TCTTATTAGC TCCCCGCTCC ACAAGACACC TQTGCTTTGG AAATCCACCA CCAATCCOGA 3720 

TCGGCTCTTA TTAGCTCCCC GCTOCACAAG ACACCTGTQA TCTGGAAATC TACCACCAAT 3780 

OCCGATCGGC TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGACATCC TCCAGGOCCA 3840 

CAGGAGCAOO TGCTGACCAO ^ rm^X ' C^ ■ it.^ CAGTTCCTGC ACAAAAAOTO TCCAGAGGGC 3900 

TGTTTGCAAA CACTAGTGCA CTTTGTAOCT TTTCACCCTC TGTCCCAGQQ AATCTAGGAG 3960 

AGATGAGGCC OGTCAGAGTC AAGAGATGTC ATCCCCCCAG QGTCTCCAAG GCATTTCCAC 4020 

ACTATTGGTG GCACCTGGAG GACATGCACC AAGGCTTGCC AGAGCCAACA GGAA6TGAGC 4080 

CCAGAGCAT6 GCACATGAQC ATCACCOGCT GATGGTGGCC TGCTGTGCCT GGTGCCAACA 4140 

GGGGCATCCC OGCCCGTACC CCTC CA GACA G6AAGCATQ8 GTTT60CCAC AGACCrOTCS 4200 

GGTGCTCCTG TGAGTGGCCT CCAGAT6TCT TTGT6CATAG 6CACAAGTGG GCCAG66CTG 4260 

GAGGGAGGTG GGAAACCTCA TCATCCGGTG GGCCCTGCCA ATCTTAACCC A6AACCCTTA 4320 

GGTATTCCTG GCAGTAGCCA TGACATTG6A OCACCTTCCT CTCCAGCCAG AGGCTGACCT 4380 

GAGGGCCACT GTCCTCAGAT GACACCACCC AGGAGCACCC TAGGTGAGGG GTGAGGGCCC 4440 

CCTTATGTGA ACCTCTTGCC TCTTCCTTTC TCOCATCAGA GTGGTTGGAT GGAGCCATTG 4500 

GCCTCCmr CTTCAGCGGG CCCTTCAACC TCTCTGCACC ATGTTGTCTG GCTGAGGAGC 4560 

TACTAGAAAA GCT6AGTGGA GTCTCCTTTC CAACAGGATG ATGCATTTGC TCAATTCTCA 4620 

OGGCTGGAAT GAGCCGGCTG GTCCCCCAQA AAGCTGGAGT GGGGTACAGA GTTCAGTTTT 4680 

CCi'Cl^'i'Crr TACAGCTCCT TGACAGTCOC AOGCCCATCT GGAGTGGGAG CTGGGAGTTA 4740 



GTGTT66A6A 
TGCACAGATA 
OGTAGGAGTG 
GGTGTTCAAT 
TTGTCTTGGG 
TTAOTCTTGG 
GGAAAAAATA 
TGGGCTGTAT 
AACTTTTCAT 
GAACTTCCAA 
AGTrOGTOQA 
GCOCCCAOAT 
OGAAGGAAGC 
CTCCTTCCGC 
CTTCATGCTG 
QCCCCAGTQC 
AOCCCTGOTQ 
0GTGTACA6A 



AGAAACAACA 
CTCTTCAAGC 
CXXXTTCTAC 
AGGCTGGGAO 
CTTTC5GTCAT 
TCAT CftGAAC 
AACTCTTCCA 
GTATATTGTT 
GGACACAATT 
ACTCAGGAAO 
CAGAIGTTAG 
CCCACAGTCA 
CATGGCTGTG 
CCCAGGTTTC 
CCTTCAAAGC 
TTGGCQATGC 
GGCAGGGTTG 
ATCAACAATA 



AAA6CCAATT 
ACTX3GA06TG 
CCACTTGTGA 
TTTTATTTAT 
TAAACCAAAG 
CTCACTT6GT 
TCOCTTAAAG 

TCCACAACCT 
TTTGCAGAGA 
ATSTATOCTA 
GAACIGAATC 

GTTCAGAGAG 
TTCTTCTCTT 
TAGATCATGT 
ATTTACAGAT 

AATAATATAC 



AGAAOCACTA 
GATTCTCTCT 
T6G6GTACAG 
CTCTTCAAAC 
GAAATGGAAG 
AOCATATAGA 
AATAGAATAG 
AGAATTTA6A 
TTCAGAT6CT 
6CAGACA6CT 
6CTTTTA6CC 
TGOGTTGTTG 
GGTGGGCTGG 
AAGGAGAGAT 
TTGCCTTGCT 
TTCTAGGCCC 
CTTCTGCTGG 
ATGTAT 



TTTT TAAftAA 
CTAGCCCTCA 
AGGCACTTGC 
TTTGTACAAG 
CCATTCCOCT 
TCAAAAGCTT 
TTTGTCOCrC 
GATACAAGAG 
GATGTAGA6C 
AGA6ATAACT 
ATAAAOCACr 
GGAAGOCAGC 
CAAGCCACTT 
TGTTCTCACC 
TAGAGAATTA 
TCAGGGTTTT 
AT6CTGCTT6 



GTGCTTACTS
GCAC00CT6C
TCTTCTGCAT
AGCTCATGGC
GTTGCTCTGC
T67AA0CACA
TCATGGQAAT
TTCTACTTAG
TATTGGGAAA
0G6GACCCAG
CAAAGATTGA
AGTG6CCTT6
CCGGGGAAAA
AACCCGCT6C
CTGCAAATCA
GTAGAGTGTG
TAATCCATTT

4800
4860
4920
4980
5040
5100
5160
5220
5280
5340
5400
5460
5520
5580
5640
5700
5760

Seq ID NO: 286 Protein sequence:
Protein Accession #: NP_570843.1

1 11 21 31 41 51
| | | | | |
MPLKHYLLLL VGCQAWGAGL AYHGCPSECr CSRASQVECT QARIVAVPTP ZiPWNAMSLQI 60
LMTHITELNB SPPUYISALI ALRIEKNELS RITPGAFRKL GSLRYLSLAN NKLQVIiPIGL 120
FQ6LDSLESL LLSSNQIiLQZ QPAHPSQCSN LKELQIiHGNH LEYIPDGAFD HLVGLTKU7L 180
GKNSLTHISP RVFQHLGtrLQ VLRLYENSLT DIPHGTPDGL VNLQSLALQQ NQI6LLSFGL 240
[illegible] YLSEIMHISQL PPSIFMQliPQ UIRLTLFGNS IiKELSIiGIFQ PMPNLRELHL 300
YDNBISSLPD NVPSNLRQLQ VLILSRHQIS FISPGAFN6L TEIjRBLSIiHT NALQDIDGNV 360
FRMLAKLQHI 8LQHNRLRQL PGNIFANVNG [illegible] LENLPLGIFD HLGKLCELRI) 420
YDNPWRCDSD ILPLRimLLL NQPRLGTDTV FVCFSPANVR 6QSLIIZHVN VAVPSVHVPE 480
VPSYPBTPHY PDTPSYPDTT SVSSTTEL7S PVEDYTDLTT IQVTDDRSVH (34TQAQSGLA 540
ZAAIVIOIVA LACSLAACVG CGCCKKRSQA VLMQMECAPHE C

Seq ID NO: 287 DNA sequence
Nucleic Acid Accession #: NM_002362
Coding sequence: 1..954

1 11 21 31 41 51
| | | | | |
ATGTCTTCT6 AGCAGAAGAG TCAGCAC3X3C AA0CCTGAG6 AAGGOGTTGA GGCCCAAGAA 60
GA660CCTG0 GCCTGGT6GQ TOCACAGGCT GCTACTACTG A6GAGCAGGA GGCItjCTGTC 120
TOCTCCTOCT CTCCTCT6GT COCTGGCACC CTGQAGGAAG TGCCTGCT6C TGAGTCAGCA 180
GGTCCTCCCC AGA6TCCTCA GGGAGCCTCT GCCTTACCCA CTACCATCAG CTTCACTTGC 240
TG6AGGCAAC CCAATGAGGG TTCCAGCAGC CAAGAAGAGG AGG6GCCAAG CACCTOGCCT 300
6A06CAGAGT CCTTGTTCCG AGAAGCACTC AGTAACAAGO TGGATGAGTT GGCTCATTTT 360
CTGCTCCX3CA AGTAT08A6C CAAGGAQCTG GTCACAAAG6 CMSAAAT6CT GGA6AGAGTC 420
ATCAAAAATT ACAAG06CTG CTTTCCT6TG ATCTTOGGCA AAGCCTC08A 6TCCCTGAA0 480
ATGATCTTTG GCATTGAOGT GAAGGAAGTG GACCCOGCCA GCAACACCTA CACCCTTGTC 540
ACCTGCCTGG GCCTTTCCTA TGATGGCCTG CTGGGTAATA ATCAQATCTT TCCCAAGACA 600
GGCCTTCTGA TAATCGTCCT 66GCACAATT GCAATGGAGG G06ACAGCGC CTCTGAGGAG 660
GAAATCTGGG AGQA8CTGQG T8TGAT6GG0 GTGTATGAT6 GGAOGGAGCA CACTGTCTAT 720
GG6GA60CCA G6AAACTGCT CACCCAA6AT TGGGTGCAGG AAAACTACCT G6A0TAC0GQ 780
CAG6TACC06 GCAGTAATCC TGOGC6CTAT GAGTTCCT6T GGGGTCCAAG GGCTCTGGCT 840
GAAACCAGCT ATGTGAAAGT CCTGGAGCAT GTGGTCAGGG TCAATGCAAG AGTT08CATT 900
GCCTACCX»T CCCTGCGTGA AGCAGCTTTG TTAGAGGAG6 AAGAGGGAGT CTGA

Seq ID NO: 288 Protein sequence:
Protein Accession #: NP_002353.1

1 11 21 31 41 51
| | | | | |
MSSBQKSQBC KPEB6VEAQE EAL6LVGAQA PTTEBQEAAV SSSSPLVP6T LSEVPAAESA 60
6PPQSPQGAS ALPTTISFTC HRQPNB6SSS QEEEGPSTSP DAESLPREAL SNKVDELAHP 120
LLRKYRAKEL VTKAEMLERV IKNYKRCPPV IFGKASESIiK MIFGIDVKEV DPASJfTYTLV 180
TCLGLSYDGL LQIMQIFPKT GLLXIVLGTI AMBGDSASEB BIHEELGVM8 VYDGREHTVY 240
GEPRKLLTQD HVQQm*EyR QVPGSNPARY EFLH6PRALA STSYVKVLBB WRVKABVRI 300
AYPSLREAAL LEEEEGV

Seq ID NO: 289 DNA sequence
Nucleic Acid Accession #: NM_002362
Coding sequence: 46..1344

1 11 21 31 41 51
| | | | | |
06GCGGC0GC GCCCTGGTTG GGTCCCCACT GCTCTOGGGG G06CCATG6A CGAG6CCGTG 60
GGCGACCTGA AGCAGGCGCT TCCCTGTGTG GCCGAGTCGC CAACGGTCCA CGTGGAGGTG 120
CATCAGC6CG GCAGCAGCAC TGCAAAGAAA QAAGACATAA ACCTGAGTGT TAGAAAGCTA 180
CTCAACAGAC ATAATATTGT GTTTG6TGAT TACACATG6A CTGAGTTTGA TGAACCTTTT 240
TTGAOCAGAA ATGTGCAGTC TGTGTCTATT ATTGACACAG AATTAAAGGT TAAAGACTCA 300
CAGCCCATCG ATTTGAGTGC ATGCACTGTT GCACTTCACA TTTTCCAGCT GAATXSAAGAT 360
G6CCCCAGCA GT6AAAATCT GQAGGAAGAG ACAGAAAACA TAATT6CAGC AAATCACTGG 420
GTTCTACCTG GAGCTQAATT CCATGGGCTT TGGGACAGCT TCGTATACGA TGTGGAAGTC 480
AAATOCCATC TGCTC6ATTA TGT6ATGACA ACTTTACTST TTTCAGACAA GAAOGTCAAC 540

AGCAAOCTCA TCACCTGGAA OCGGGTGGIG CTGCTGCAOG CTCCTOCTGO CACIGGAAAA 600 

AC M OOCTgr GTAMSOGTT A6CX3CAGAAA TrQACMOTfi GACTTTCSMG CAG6TAC0SA 660 

TATCGCCAAT TAATTGAAAT AAACAGCCAC AGCCTCTTTT CTAAGTGGTT TTCGGAAAGT 720 

GGCAAGCTGG TAACCAAGAT GTTTCAGAAG ATTCAGGATT TGATTGATGA TAAAGACGCC 780 

CTGGTGTTCG TGCTGATTGA TGAGGTGGAG AGTCTCACAG CCGCCCGAAA TGCCT6CAG6 840 

G06G6CAC06 AGCCATCAGA T6CCATC06C GTGGTCAATG CTGTCTTGAC CCAAATTGAT 900 

CAGATTAAAA GGCATTCCAA TCTIGTGATT CTGACCACTT CTAACATCAC O3A6AA0ATC 960 

GACGTGGCCT TCCTGGACAG QGCTGACATC AAGCAGTACA TTCGGCCACC CTCTGCAGCA 1020 

GCCATCTTCA AAATCTACCT CTCTTGTTTQ GAAGAACTGA TGAAGTGTCA GATCATATAC 1080 

CXrrOGCCAGC AGCTGCTGAC CCTCCGAGAG CTAGAGATGA rrGGCTTCAT TGAAAACAAC 1140 

GTGTCAAAAT TGA6CCTTCT TTTQAATGAC ATTTCAAGGA AGAGGBAGGG CCTCAG0Q6C 1200 

OOGQTOCr g A GAAAACTCCC CTrfCiX SG Cr CAT606CT6T ATGTOCAGGC CCCCACOCTC 1260 

ACCATAGAGG GGTTCCTCCA GGCCCTGTCT CTGGCA6TGG ACAAGCAGTT TGAAGA6AGA 1320 

AAGAAGCTTG CAGCTTACAT CTGATCCTGQ GCTTCCCCAT CTGGTGCTTT TCCCATGGAG 1380 

AACACACAAC CAGTAAGTGA GGTTGCCCCA CACAGCOGTC TO0CAGG6AA TCCCTTCT6C 1440 

AAACCAAAOG TTACTEAGAC T6CAAGCTAG AAAGCCACCA AGGCCAGGCT TTGTTAAAAO 1500 

AAOTGTATTC TATTTATGTT GTTTTAAAAT GCATACTGAG AGACAAACAT CTTGTC A TTT 1560 

TCaCTQTTTG TAAAAGATAA TTCAGATTGT TTGTCTCCTT GTGAAGAACC ATCGAAACCT 1620 

GTTTGTTCCC AGCCCACCCC CAGTGGATGG GATGCATAAT GCCAGCAAGT TTTGTTTAAC 1680 

AGCAAAAAAG GAAGATTAAT GCAG6TGTTA TAGAAGCCA6 AAGAGAAACT GTGTCACCCT 1740 

AAAGAAGCAT ATAATCATAQ CATTAAAAAT GCACACATTA CTCGAG6TG6 AAGGTGQCAA 1800 

' iT G CIT T C IG ATATCAGCTC GTTTSATTTA GTGCAAAAAT GITTTCAAGA CTATTTAATG 1860 

GATGTAAAAA AGCCTATTTC TACATTATAC CAACTGAGAA AAAAATGGTC GGTAAAGTGT 1920 

TCTTTCATAA TAAATAATCA AGACATGGTC CCATTTGCAG GAAAAGTGCA GACTCTGAGT 1980 

GTTCCAGGGA AACACATGCT GGACATCCCT TCTAACCCGG TATGQGCGCC CCTGCATTGC 2040 

TGGGATGTTT CTGCCCAOGG rmt^i ' T i O l' 0CAATAAC3ST TATCACATTT CTAAT GAGGA 2100 

TTCACATTAA TATAATATAA AATAAATAGG TCAGTTACT6 GTCTCTTTCT 6CGGAATGTT 2160 
ATGTTTTGCT TTTATCTCAC AGTAAAATAA ATATAATTAA AAA 

Seq ID NO: 290 Protein sequence:
Protein Accession #: NP_004228

1 11 21 31 41 51 

I 1 I I 1 I 

MDEAVGDLKQ ALPCVAESPT VHVEVHQRGS STAKKEDIKL SVRKLLNRHN IVFGDYTWTB 60 

PDEPPLTRNV QSVSIXDTEL KVKDSQPIDL SACTVALHIP QLNBDGPSSB NLEEETENII 120 

AANHWVLPAA EFHGLWDSLV YDVBVKSHLL DYVMTTLLFS DKNVNSNLIT WKRWLLHGP 180 

PGTGRTSLCK ALAQKLTIRL SSRVRYGQLI BINSKSLPSK WF8ESGXLVT KMFQKIQDLI 240 

DDKDALVFVL IDBVBSLTAA RNACRAGTEP SDAIRWNAV LTQIDQIKRH SNWILTTSU 300 

ITEKIDVAFV DRADIKQYZG PPSAAAZFKZ YLSCLEEU1K OQXZypRQQL LTLRELEMIG 360 

FIEI3NVSKLS LLLNDISRXCS EGLS(BCVIiRX LPFLABAIiYV QAPTVTIEGF LQALSLAVDK 420 
QFEERKKLAA YX 

Seq ID NO: 291 DNA sequence

Nucleic Acid Accession #: NM_002658.1

Coding sequence: 77-1372

1 11 21 31 41 51 

I I I I I t 

GTCCCXX3CAG CGCCGTCGCG CXXTTCCTGCC GCAGGCCACC QAGQCCGCCG COGTCTAGCG 60 

CCCCGACCTC GCCACCATGA GA6CCCTGCT GGCGCX3CCTG CTTCTCTGCG TCCTGGTOGT 120 

GA60GACTCC AAAG6CAGCA AT6AACTTCA TCAAGTTOCA TCGAACTGTG ACT6TCTAAA 180 

TGGAGGAACA T6T0T8TCCA ACAAGTACTT CTOCAACATT CACTGGTGCA ACT6CCCAAA 240 

GAAATTGQSA GGGCAGCACT GT6AAATAGA TAAGTCAAAA ACCT6CTATG AGGGGAATOO 300 

TCACTTTTAC CGAGGAAAGG CCAGCACTGA CACCATGGGC CGGCCCTGCC TGCCCTOGAA 360 

CTCT6CCACT GTCCTTCAGC AAACGTACCA TGCXXIACAGA TCTGATGCTC TTCAGCTGGG 420 

CCTGGGGAAA CATAATTACT 6CA0GAACCC AGACAACOGG AGGOGACXXTT GGTGCTATGT 480 

6CAGGT0GGC CTAAA6CC6C TTGTCCAAGA OTOCATQOTO CATGACTGOG CAGATGGAAA 540 

AAAGCCCTCC TCTCCTCCAG AAGAATTAAA ATTTCAGTGT GGCCAAAAGA CTCTGAGGCC 600 

COGCTTTAAO ATTATTGGGO GAGAATTCAC CACCATCGAG AACCAGCCCT QGTTTOOGGC 660 

CATCTAC3U3G AGGCACOGGC G6GGCTCTGT CACCTAOntS TGTGGAGGCA GCCTCATCAG 720 

CCCTTOCT6G GT6ATCAGCX3 CX^ACACACTG CTTCATTGAT TACCCAAAGA AGGAGGACTA 780 

CATOGTCTAC CTGGGTGGCT CAAGGCTTAA CTCCAACAOG CAAGGG6AGA TGAAGTTT6A 840 

GGTGGAAAAC CTCATCCTAC ACAAGGACTA CAGCGCTGAC AOGCTTGCTC ACCACAAGGA 900 

CATTGCCTTG CTQAAGATCC GTTCCAAGGA GGGCAGGTGT GCGCAGCCAT CCCGOACTAT 960 

ACAGACCATC TGCCTGCCCT CGATGTATAA CGATCCCCAG TTTGQCACAA GCTGTGAGAT 1020 

CACTG6CTTT GGAAAAGAGA ATTCTACCGA CTATCTCTAT CCGGAGCAGC TGAAAATGAC 1080 

TGTTGTGAA6 CTGATTTCCC ACOQGGAGZ6 TCAOCAGCCC CACTACTAOG GCTCTGAAGT 1140 

CACCACCAAA ATGCIATGTG CTGCTGACCC CCAATG6AAA ACAGATTCCT GCCAGGGAGA 1200 

CTCA6QGGGA CCCXTTCGTCT GTTCCCTCCA AGGCCGCATG ACTTTGACTG GAATT G TGAG 1260 

CTOGQQCOGT OGATGTGCCC TGAAGGACAA GCCAGGCGTC TACAOGAGAG TCTCACACrr 1320 

CTTACCCTGG ATCOGCAGTC ACACCAAGGA AGAGAATGGC CTGGCCCTCT GAG6GTCCCC 1380 

AG6GA0GAAA G6G6CACCAC CUUCm ' CiT GCT6GTTGTC ATTTTTGCA6 TA6AGTCATC 1440 

TCCATCAGCr GTAAGAAGAG ACTGGGAAGA TAGGCTCTGC ACAGAT6GAT TTGCCTGTGG 1500 

CaCCACCAOG GTGAA06ACA ATA6CTTTAC CCTaU3GGAT AGGCCTGGGT GCTG6CTGCC 1560 

C3VGACCCTCT GGCCAGGAT6 QAGGGGTGGT CCTGACTCAA CATGTTACTG AOCAOCAACT 1620 

TG l ' C Tl' iTf C lOGACTGAAG CCTGCAGGAG TTAAAAAGGG GAGGOCATCT OCTGTGCAT6 1680 

GGCTOGAAGG GA6AGCCAGC TCCCOOQACC GGTGGGCATT TGTGAGQOCC ATGGTTGA6A 1740 

AAT6AATAAT TTCCCAATTA G6AAGT6TAA GCAGCT6AG6 TCTCTTGAG6 6A6CTTAGCC 1600 

AATGTGGGAG CAGOGGTTTQ GGGAGCAGAG ACAC7AA0GA CTTCAGGGCA GGGCTCTGAT 1860 

ATTCC3VTGAA TGTATCAGGA AATATATATG TGTGTGTATG TTTGCACACT TGTTGTGTGQ 1920 

GCrGTQAGTG TAA6TGTGAG TAAGA6CTGG TGTCTGATTG TTAAGTCTAA ATATTTCCTT 1980 

AAACTGTGIG GACT6T8ATG OCACACAGAG TGGTCTTTCT GGAGAG6TIA TAG6TCACTC 2040 

CTGGGGCCTC TTGGOTCCCC CAOGTGACAO T6CCTO0QAA TGTACTTATT CTOCAGCATO 2100 

ACCTGTQVCC AGCACTGTCT CAGTTTCACT TTCACATAGA TGTCCCTTTC TTGGCCAGTT 2160 

ATCCCTTOCT TTTAGCCTAG TTCATCCAAT CCTCACTGGO TGGGGTGAGG ACCACTCCTT 2220 

ACACTGAATA TTTATATTTC ACTATTTTTA TTTATATTTT TGTAATTTTA AATAAAAGTG 2280 




ATCAATAAAA TGTGATTTTT CTGA 



Seq ID NO: 292 Protein sequence:
Protein Accession #: NP_002649.1

1 11 21 31 41 51

I I I I I I 

MRALLARIiLL CVLWSDSKG SNELHQVPSH CDCLNGGTCV SNKYPSNIEW OJCPKKPGGQ 60 
HCBIDRSKTC YBGNGHFYRG KASTDTMGRP GLPWNSAIVL QQTYHAHRSO ALQIXSLGKHN 120 
YCSNFDNRRR PMCyVQVGZJC PLVQBCMVHD CADGKKPSSP PEELXFQOGQ KTLRPRFKIZ 180 
GGEFTTIENQ PHFAAIYRRH BGGSVTYVOG GSXiZSPCNVZ SATHCFIDYP KKEDYZVYLG 240 
RSRUrSKTQG EMKFBVEHLI I.HXDY6ADTL AHHKDZALUC IRSKE6RCAQ PSRT IQTIC L 300 
PSMYNDPQFG TSCEIT6FGK EI7STDYIiYFE QLKNTWXZiZ SHREOQQPKY YG SEVT TKML 360 
CAADPQWKTD SCQGDSGGPKi VCSX^QGRKTIi T&IVSW(3t6C ALECDKPGVYT RVSHFLFUZR 420 
SHTKEENGLA L 

Seq ID NO: 293 DNA sequence
Nucleic Acid Accession #: NM_001498
Coding sequence: 93..2006



1 11 21 31 41 51

I I I I I I

06CA0SAGGC TGAGTGTCOG TCTOGOGCCC GGAAG06GGC GACCGCCGTC AGCCCGGAGG 60
AG6AG6A6GA GGAGGAGGAG GAGGGGGC6G CCATGGGGCT GCTGTCCCAG GGCTCGCCGC 120
TGAGCTGGGA GGAAACCAAG CGGCATGCCG ACCAOGTGCG GCGGCAOGGG ATOCTCCAOT 180
TCCK3CACAT CTA0CACX3CC GTCAAGGACC GGCACAAGGA CGTTCTCAAG TGGG606ATO 240
AGQTG8AATA CATGTTGGTA TCTTTT6ATC ATGAAAATAA AAAAGTCOGG TTGGTCCTGT 300
CTGGOGAGAA AGTTCrTGAA ACTCTGCAAG AGAAGGGGGA AAGGACAAAC GCAAACCATC 360
CTACCCTTTG GAGACCAGAG TATOSGAGTT ACATGATTGA AGGGACACCA GGACAGCCXTT 420
ACGGAGGAAC AATGTCCXSA6 TTCAATACA6 TTGAGGCCAA CATGCGAAAA CGCCGGAAGG 480
AGGCTACTTC TATATTAGAA GAAAATCA6G CTCTTTGCAC AATAACTTCA TTTCCCAQAT 540
TAGGCIGTOC TQQGTTCACA CTG000GA86 TCAAACXXAA CCCAGZGGAA GGAGGAGC1T 600
CCAAGTGOCr CTTCTTTCCA GATGAAGCAA TAAACAA6CA CCCTC6CTTC AGTACCTTAA 660
CAAGAAATAT CCGIVCATAGG AGA6GAGAAA AGGTTGTCAT CAATGTACCA ATATTTAAGG 720
ACAAGAATAC ACCATCTCCA TTTATAGAAA CATTTACTGA GGAT6ATGAA GCTTCAAGGG 780
CTTCTAAGCC GGATCATATT TACATGGATG CCATGGGATT T66AATGGGC AATTGCTGTC 840
TCCAGGTGAC ATTGCAAGCC TGCAGTATAT CrrGAGGCCAG ATAGCTT7AT GATCASTTGG 900
CTACTATCT6 TCCAATTGTT ATGGCTTTGA GTGCTGCATC TCCCTTTTAC 06AG6CTATS 960
TGTCAGACAT TGATTGTCGC TGGGGAGT6A TTTCTGCATC T0TAGAT6AT AGAACTOGOG 1020
AGGAGGGAG6 ACTGGAGCCA TTGAAGAACA ATAACTATAG GATCAGTAAA TCCX:GATATG 1080
ACTCAATAGA CAGCTATTTA TCTAAGTGTO GTGAGAAATA TAATGACATC GACTTGAOGA 1140
TAGATAAAGA GATCTAOGAA CAGCTGTTGC AGGAAGGCAT 7GATCATCTC CTGGCCCAGC 1200
ATOTTOCTCA TCTCTTTATT AGAGACCCAC T6ACACTGTT 7GAAGAGAAA ATACAOCTGG 1260
ATGATGCTAA TGAGTCTGAC CATTTTGAGA ATATTCAGTC CACAAATTG6 CAGACAATGA 1320
GATTTAAGCC CCCTCCTCCA AACTCAGACA TTGGATGGAG AGTAGAATTT CGACCCATGG 1380
AGGTGCAATT AACAGACTTT GAGAACTCTG CXrTATGTGGT GTTTGTGGTA CTGCTCACCA 1440
QAGTGATOCT TTOCTACAAA TTOGATTTTC TCATTGCACT 6TCAAAQGTT GAT6AQAACA 1500
TGAAGQTAGC ACAGAAAAGA GATGCTGTCT T6CAGGGAAT GTTTTATTTC AGGAAAOATA 1560
TTTGCAAAGG T6GCAATGCA GTGGTGGATG GTTGTGGCAA GGCCX»GAAC AGCAOGGAGC 1620
TCGCTGCAGA GGAGTACACC CTCATGAGCA TAGACACCAT CATCAATGGG AAGGAAGGTG 1680
TGTTTCCTGG ACTGATCCCA ATTCTGAACT CTTACCTT6A AAACATGGAA GTGGATGTGG 1740
ACACCAGATG TAGTATTCTO AACTACCTAA AOCTAATTAA GAAGAGAGCA TCTGGAGAAC 1800
TAATGACA6T TGCCAGATGQ AT6A6G6AGT TTATOGCAAA CXJ^TCCTGAC TACAAGCAAG 1860
ACAGTGTCAT AACTGATGAA ATGAATTATA GCCTTATTTT GAAGTGTAAC CAAATTGCAA 1920
ATQAATTATG TGAATGCCCA GAGTTACTTG GATCAGCATT TAGQAAAGTA AAATATAGTG 1980
GAAGTAAAAC TGACTCATCC AACTA6ACAT TCTACAGAAA GAAAAATGCA TTATTGACGA 2040
ACTGGCTACA GTACCATOCC TCTCAGCOOQ T6TGTATAAT AT6AAGACCA AATGATAGAA 2100
CTGTACTGTT TTCTGG6CCA GTGAGCCAGA AATTGATTAA GGCTTTCTTT GGTAGGTAAA 2160
TCTAGAGTTT ATACAGTOTA CATGTACATA GTAAAGTATT TTTGATTAAC AATQTATTTT 2220
AATAACATAT CTAAAGTCAT CATGAACTGG CTTGTACATT TTTAAATTCT TACTCTGQAO 2280
CAACCTACT6 TCTAAGCAGT TTT6TAAATQ TACTGGTAAT TGTACAATAC TTGCATTCCA 2340
GAGTTAAAAT GTTTACTGTA AATTTnCTT CTTTTAAAOA CTACCTGGGA CCTGATTTAT 2400
TGAAATTTTT CTCTTTAAAA ACATTTTCTC TOGTTAATTT TCCTTTGTCA ' mtXTiTtfl ' 2460
TGTCTACATT AAATCACTTG AATCCATTQA AAGTGCTTCA AGGGTAATCT TGGGTTTCTA 2520
GCACCTTATC TATGATGTTT CTTTTGCAAT TGGAATAATC ACTTGGTCAC CTTGCOCCAA 2580
GCTTTCCCCT CTGAATAAAT ACCCATT6AA CTCTGAAAAA AAAAAAAAAA AAAA



Seq ID NO: 294 Protein sequence:
Protein Accession #: NP_001489



1 11 21 31 41 51 

I I I I I I

KGLLSQGSPL SWEETKRHAD HVRRHGILQP LHIYHAVKDR HKDVLKKGDB VBYMLVSFDH 60 

EMKKVRLVLS GEBWLBTLQE KGERTNPNHP TLWRPEYGSY MIBQTFGQPY GGTMSEPKTV 120 

EANMRKRRKE ATSZLBENQA LCTITSFPRL GCPGFTLPEV KEHPVEGGAS KSLFFPDBAZ 180 

HKHPRPSTLT RNIRBRRGEK WZNVPZFKD RI3TPSPPZET FTEDDEASRA SKPDHZYMDA 240 

MGFOCGNCCIj QVTPQACSZS EARYLYDQLA TICTIVMALS AASPFYRGYV SDZDCRWGVZ 3O0 

SASVDDRTRE ERGLEPIiKNN NYRZSKSRYD SIDSYLSKOG EKVNDZDLTl DKEZYEQLLQ 360 

SGZDHLIiAQH VAELFIRSPL TLFEEKZBLD DANESDHFEN ZQSTKWQTMR FKPPPPKSDZ 420 

GWRVEPRPMS VQLTDFENSA YWFWLLTR VZLSYKUSFXi ZPLSKVDEKM KVAQKRDAVL 480 

QGMFYFRKDZ CXGGNAWDG GGKAQNSTEL AAEETTZMSZ DTZZK6KS6V FPGLZPZLNS 540 

YLSmBVDVD TRCSZUIYIiK LZKKRASGBL MmRHMREF ZAKHPDYKOD SVZTDEMHYS 600 
LZLKOIQZAN BLCBCPBLLG SAPRKVKSrSO SKTDSSH 



Seq ID NO: 295 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 247-816

1 11 21 31 41 51

I I I I I I 

AGTGTTGG6C TGGGGCAfiOC A06CTGT06C TG6CTACTTC C Cn XXUVCt; ATC0CCCTT6 60 

GGCCAAAOGG GAT0G8TGCT TCTGGTGAGA CGOCTCCCCA TGCACATCAC TCCCAGGTGC 120 

CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACXAGTG 180 

G6GAGG06CC ACAACTTCAC TGCCATTTTG TGAGGTGCXS CCXSTCTCTCC TCCA6CAAGG 240

GAAACAATGA OOGATAAAAC AGAGAAGCrG GCIGTAGATC CXGAAACTGT GTTTAAAaST 300 

CCCAGGGAAT GTGACAC3TCC TTGGTATCA6 AAAA66CAGA GGATG6CCCT GTT66CAAG6 360 

AAACAAGGAO CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAA6A AAAGAAGCTT 420 

ATGACAGGAC ATGCTATTCC ACCCAGCCAA TTGGATTCTC AGATTGATGA CTTCACnXXTT 480 

TTCAGCAAAG ATAGGATGAT GCAGAAACCT GGTA6CAATG CACCTCnGGG AOGAAAOGTT 540 

ACXAGCAGTT TCTCTGGAGA TQACCTAGAA TGCAGAGAAA CAGCXTTOCTC TGCCAAAAGC 600 

CAAOCSAGAAA TTAATGCTGA TATAAAAOGT AAATTA6TGA AGGAACTCOG AT60GTT6GA 660 

CAAAAATATQ AAAAAATCTT CGAAATGCTT GAAGGAGTGC AAGGACCTAC TGCAGTCAGG 720 

AAGCGATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGACG AGACTTTGIT 780 

AAGCACCTTA AGAAGAAACT GAAACGTATG ATTTGAGAAT ACrTGTCCCT GGAGGATTAT 840 

CACACCCCAA ATGCATAATC TCGTTAATGA TTGAGGAGAG AAAAGGATCA GATTGCTGTT 900 

TTCTACAATG GAGCAGGATA TT6CTGAAGT CTCCTGGCAT ATGTTACCGA ATCAAATA6C 960 

CrrCCAGAGG CTAAGAAATT TCTGTTAGTA AAAGATGTTC TTTTTCCCAA A6CATTTTAT 1020 

TTGAAAGGAT AACTTGTGTT TTGGTTATTT TGTATTCCCA CCTGT G CT OQ TAGATATTAT 1080 
TAACCCATTA GGTAAATACT ATTACAGTOG TGGTTTCTGC A 






Seq ID NO: 296 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I

MTDKTEKVAV OPETVPKRPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKEKKLMT 60 
GHAIPPSQLD 8QI0DFTGPS XDRMMQKPGS NAPVGGNVTS SFSGDOLBCR ETASSFKSQR 120 
EINADIKRKL VKELRCVGQK YERIFa4LB6 VQ6PTAVRXR FFESIIXEAA RCKRRDFVKH 180 

LKKKLKRMI 

Seq ID NO: 297 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-815 

1 11 21 31 41 51 

I I I I I I 

ABTGTTOGGC T6GG6CAGGC ACX3CTGTQGC TGGCTACTTC CVn'CCreOC ATCCCCCTTG 60 

GGCCAAACGG GATCGGTGCT TCTGGTGAGA OGCCTCCCCA TGCACATCAC TCCCAGGTGC 120 

CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCA6TG 180 

QOGAQGOGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCQ CCGTCTCTCC TCCAGCAAGQ 240 

GAAACAATGA OOGATAAAAC AOAGAAGGTG GCIGTAGATC CTGAAACTOT GTTTAAAOGT 300 

CCCAGGGAAT 6TGACAGTCC TT06TATCAG AAAAGGCAQA OQATGGOCCT GTTGGCAAGG 360 

AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAAGCTT 420 

ATGACAGGAC ATGCTATTCC ACCCAGCCAA TTGGATTCTC AGATTGATGA CTTCACTGGT 480 

TTCAGCAAAG ATAGGATGAT GCAGAAACCT GGTAGCAAT6 CACCTGTGGG AGGAAACGTT 540 

ACCAGCA6TT TCTCTGGAGA TGACCTAGAA TGCAGAGAAA CAGCCTOCTC TGCCAAAAGC 600 

CAACAAGAAA TTAATGCTGA TATAAAAOGT AAATTAGTGA A6GAACTCG6 ATG06TT6GA 660 

CAAAAATATG AAAAAATCTT CGAAATGCTT GAAGGAGTGC AAGGACCTAC TGCAGTCAGG 720 

AAACGATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGACG AGACTTTGTT 780

AAGCACCTTA AGAAGAAACT GAAACGTATG ATTTGAGAAT ACTTGTCCCT GGAGGATTAT 840 

CACACCCCAA ATGCATAATC TCATTAATGA TTGAGGAGAG AAAAGGATCA GATTGCTGTT 900 

TTCTACAATG GAGCAGCSITA TTGCTGAAGT CTCCTGGCAT ATGTTAOOGA ATCAACT6GC 960 

CTTCCAGA6G CTAAGAAATT TCTGTTAGTA AAAGATGTTC TTTTTCCCAA AGCGTTTTAT 1020 

TTGAAAGGAT AACTTGTGTT TTGQTTATTT TGTATTCCCA C C I'Cr GC IOG TAGATATTAT 1080 
TAACCCATTA GGTAAATACT ATTACAGTOG TGGTTTCTGC A 



Seq ID NO: 298 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51 

I I I I I I 

MTDICTEKVAV DPETVFRRFR BCDSPSYQXR QRKAUARKQ GAGDSLIAGS AMSKEKKLMT 60 
GHAIPPSQLD SQIDDFTGF8 KDRMMQXP6S HAPVGGNVTS SPSTOL B CR ETASSPKSQQ 120
EDIADIKRKL VKBLRCV6QK YEKZFEMLEO VQ6PTAVRXR FFESIZKEAA RCKRRDFVKH 180 
LKKKLKRKI 



Seq ID NO: 299 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-815

1 11 21 31 41 51

I I I I I I

AGT6TT066C TGGGGCRGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60 

GGCCAAAOGG GATCGGTGCT TCTGGTGAGA OGCCTCCCCA TGCACATCAC TCCCAGGTGC 120 

CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGQGCAOQTT TCTAGAAAGT GCCACCAGTG 180 

GGGAOGOQOC ACAACTTCAC TGCCATTTTG TGAGGTQCGQ OOGTCICTOC TCCAGCAA06 240 

GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAAOGT 300 

CCCAGGGAAT GTGACAGTCC TTOGTATCAG AAAAGGCAGA GGATGGCGCT GTTGGCAAGG 360 

AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGC AAAGAGCTTA 420 

TGACAGGACA TGCTATTCCA CCCA6CCAAT T6GATTCTCA GATTGATGAC TTCACTOGTT 480 



TCAGCAAMSV TAGGATGATG O^GAAACCTG GTAGCA^lTGC ACCTOrrGGGA GGAAACGTTA 540
CCAGCAGTTT CTCTGGAGAT GACCTAGAAT GCAGAGAAAC AGCCTCCTCT CCCAAAAGCC 600
AACAAGAAAT TAATGCT6AT ATAAAA06TA AATTAGTGAA GGAACTCOGA IGCGTTGGAC 660
AAAAATAT6A AAAAATCTTC GAAATGCTIG AAGOAGTGCA AG6AC3CTACT 6CAGTCAG6A 720
AACGATTTTT TGAATCCATC ATCAAGGAAG CAGCAAGATG TATGAGA06A GACTTTGTTA 780
AGCACCTTAA GAAGAAACTG AAAOSTATGA TTTGAGAATA CTTGTOCCTQ GAGGATTATC 840
ACACCCCAAA TGCATAATCT CATTAATGAT TGAGC5AGAGA AAAGGATCAG ATTGCTC3TTT 900
TCTACAATG6 AGCAGGATAT TGCTGAAGTC TGCXGGCATA TGTTA0C3GAA TCAACTGGCC 960
TTGCAGAGGC TAAGAAATTT CTGrTTAGTAA AAQATGTTCT TTTTOOCAAA GOGTTTTATT 1020
TGAAAGGATA ACTTGTGTTT TGGTTATTTT GTATTCCCRC CIGTOCIGGT AGATATTArr 1080
AACCCATTAG GTAAATACTA TTAOU3T0GT GGTTTCTOCA

Seq ID NO: 300 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I

HTDKTEKVAV DPETVFKItPR BCDSPSYQKR QRMALLARKQ 6AGDSLIAGS AMSRAKKLMT 60
GBAIPPSQU) SQIDDFTGFS XDKMMQKPOS NAPVGQfVTS SPSGDDLECR ETASSPKSQQ 120
EZNADZKRRZi VKEI/RCVGQK YEKIFEMZjEG VQGPTAVRKR FFBSZIKBAA RCMRRDFVKH 180
LKKKLKRMI

Seq ID NO: 301 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-812

1 11 21 31 41 51

I I I I I I

AGTGTT0G6C TGGGGCAGGC AOGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60
6GCCAAACG6 GATCGGTGCT TCTGGTGA6A CGCCTCCCCA TGCACATCAC TCCCAGGTGC 120
CCTAGGGG6C ACATTTCCCA CAACTC0CA6 AGGGCA66TT TCTAGAAAGT GCCACCAGTG 180
666AG60Q0C ACAACTTCAC T6CCATTTTG TGAGGTGC06 C06TCTCT0C TCCAGCAAGG 240
GAAACAATGA CC6ATAAAAC AGAGAAGGTG 6CTGTAGATC CTGAAACTGT GTTTAAAC6T 300
CCCAGGGAAT GTGACAGTCC TTOGTATCAG AAAAGGCAGA GGATGQCOCT GTTGGCAAGO 360
AAACAAGGAG CAGGAGACAfi CCTTATTGCA GGCTCIQCCA TGTCCAAAGA AAAGAGCTTA 420
TGACAGGACA TGCTATTCCA COCAGOCAAT TGQATTCTCA GATT6AT6AC TTCACTGGTT 480
TCAGCAAAGA TQGGATQATG CAGAAACCTG GTAGCAATGC ACCTGTGGGA GGAAATGTTA 540
CCAGCAATTT CTCTGGAGAT GACCTAGAAT GCAGAG6AAT AGCCTCCTCT CCCAAAAGCC 600
AACAAGAAAT TAATGCTGAT ATAAAATGTC AAGTAGTGAA GGAAATCOGA TGCCTTOGAC 660
AATAT6AAAA AATCTTCGAA ATGCTT6AAG 6A6TGCAAGG ACCTACT6CA OTCAGGAAAC 720
GATTTTTTGA ATCCATCATC AAGGAA6CAG CAAGATGIAT GAGAOGAGAC TTTGTTAAGC 780
ACCTTAAGAA GAAACTGAAA OGTATGATTT 6AGAATACTT 6TCCCT6GA6 6ATTATCACA 840
CCCCAAATGC ATAATCTCAT TAATGATTGA GGAGAGAAAA GGATCAGATT GCTGTTTTCT 900
ACAATGGAGC AGGATATTGC TGAAGTCTCC TGGCATATGT TACCGAATCA ACTGGOCTTC 960
CA6A66CTAA GAAATTTCT6 TTAGTAAAA6 ATGTTCTTTT TCCCAAAGOQ TTTTATTTGA 1020
AASGATAACT TGTGTmGO TTATTTTOXA TTGCCACCIG T3CTGGTASA TATTATTAAC 1080
CCATTAGGTA AA7ACTATTA GAarOGTGGT TTCTGCA

Seq ID NO: 302 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I

MTDKTEKVAV DPETVFKRPR BCDSPSYQKR QRMALLARKQ GACTJSLIAGS AMSKEKKIrflT 60
Q3AIPP8QLD SQIDDFTGFS KDGMMQKPGS MAPVG6HVTS HFSCSDLBCR GIASSPKSQQ 120
EINADIKOQfV VKEIRCLGQy EKZFEMLE6V Q6PTAVRKRP FBSZZXEAAR CKRRDFVKHL 180
KKKLKRMI

Seq ID NO: 303 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 247-815

1 11 21 31 41 51

I I I I I I

AGT6TTCG6C TGGGACA6GC ACGCTGTGGC TGGCTACTTC CCTTCCTTCC ATCCCCCTlXi 60
GGCCAAACAG GATCGGTGCT TCTGGTGAGA OGTCTOCCCA TGCACATCAC TCCCAGATGC 120
CCTAGG6GGC ACATTTCCCA CAACTCCCAG AGG6CAGGTT TCTAGAAAGT GCCACCAGTG 180
GGGAGGOGCC ACAACTTCAC TOCCATTTTG TGAGGTOCCG COGTCTCTCC TCCAGCAAGG 240
GAAACAATGA COGATAAAAC AGAGAAGGTG GCTCTAGATC CTGAAACTGT GTTTAAAC6T 300
CCCAGGGAAT GTGACAGTCC TTOGTATCAG AAAAGGCAGA GGAT66CCCT GTTGGCAAOO 360
AAACAAGGAG CAG6AGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGC AAAGAGCTTA 420
TGACAGGACA TGCTATTCCA CCCAGCCAAT TGGATTCTCA GATT6ATGAC TTCACTGGTT 480
TCASCAAAGA TAGGATGATG CAGAAACCTG GTAGCAATGC ACCTGTGGGA GGAAAOGTTA 540
CCAGCAGTTT CTCTGGAGAT ^CCTAGAAT GCAGA6AAAC AGCCTCCTCT CCCAAAAGCC 600
AACAAGAAAT TAATGCTGAT ATAAAAOGTA AATTAGTGAA GGAACTCOGA TGCGTTGGAC 660
AAAAATATGA AAAAATCTTC GAAATGCTTG AAGGAGTGCA AGGACCTACT GCAGTCAG6A 720
AAOGATTTTT TGAATCCATC ATCAAGGAAG CAGCAAGATG TATGAGAOGA GACTTTGTTA 780
AGCACCTTAA GAAGAAACTG AAAOGTATGA TTTGAGAATA CTTGTCCCTG GAGGATTATC 840
ACACCCCAAA TGCATAATCT OGTTAATGAT TGAOGAGAGA AAAGGATCAG ATTGCTGTTT 900
TCTACAATGG AGCAGGATAT TGCTGAAGTC TCCTGGCATA TGTTACOGAA TCAACTGGCC 960
TTCCAGAGGC TAAGAAATTT CT6TTAGTAA AAGATGTTCT TTTTCCCAAA GOGTTTTATT 1020
TGAAAGGATA ACTTGTGTTT TGGTTATTTT GTATTCCCAC CTGTGCTGGT AGATATTATT 1080
AACOCATTAG GTAAATACTA TTACAGTC6T GGTTTCTQCA

Seq ID NO: 304 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I

MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QSMALIAEKQ GW3DSLIA6S AMSKAKKLMT 60 

CSAIPPSQU} SQXDDFTGP3 KDRMKQKPGS UAPVGGNVTS SFSGDDLBCR ETASSPKSQQ 120 

BIKADIKRKL VXELRCVGQK yBKJFE34LEG VQ6PTAVRKR FFESIIKEAA ROffiRDFVKH 180 

XiKKKLKRMI 

Seq ID NO: 305 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 87-689

1 11 21 31 41 51

I I I I I I

OGTCGAGGCA GCTAGCGCGA GGCTGGGGAG OGCTGAGCOG OGCGTOSTGC CCTGCGCTGC 60 

CCAOACTAGC GAACAATACA GTCAGGATGG CTAAAGGTGA CCCCAA6AAA CCAAAG6GCA 120 

AGATGTCOGC TTAT6CCTTC TTT6TGCAGA CATGCAGAGA AGAACATAAG AA6AAAAAGC 180 

CAGAGGTCCC TGTCAATTTT GCGGAATTTT CCAAGAAGTG CTCTGAGAGG TGGAAGACGA 240 

TGTCCGGGAA AGAGAAATCT AAATTTGATQ AAATGGCAAA G6CAGATAAA GTGCGCTATG 300 

ATCGGGAAAT GAAGGATTAT GGACCAGCTA AGGGAQGCAA QAAGAAGAAG GATCCTAATG 360 

CTCCCAAAAG GCCACCGTCT GGATTCTTCC TGTTCTGTTC AGAATT OOGC CCC AAGATC A 420 

AATCCACAAA CCCCGGCATC TCTATTGGAG ACX5TGGCAAA AAAGCTGQGT GA6ATGTGGA 480 

ATAATTTAAA T6ACAGTGAA AAGCAGCCTT ACATCACTAA GGCGGCAAAG CTGAAGGAGA 540 

AGTATGAGAA GGATGTTGCT GACTATAAG7 OGAAAGGAAA GTTTGATGGT GCAAA6GGTC 600 

CTGCTAAAGT TGCCCGGAAA AA06TGGAAG AGQAAQATGA AGAAQA6GAG GAGGAAGAAG 660 

AGGAGGAGGA GGAGGAGGA6 GATGAATAAA GAAACTGTTT ATCTGTCTCC TTGTGAATAC 720 

TTAGAGTAGG GGAGCOCCGT AATTGACACR TCTCTTATTT GASAAGTGTC TGTTGCCCTC 780 

ATTAGGTTTA ATTACAAAAT TTGATCACGA TCATATTGTA GTCTCTCAAA GTGCTCTAGA 840 

AATTGTCftGT GGTTTACATG AAGTG6CCAT GGGTOTCTGG AGCACCCTGA AACTGTATCA 900 

AAGTTGTACA TATTTCCAAA CATTTTTAAA ATGAAAAGGC ACTCTOGTGT TCTCCTCACT 960 

CTGTGCACTT TGCTGTTGGT GTGAC3U«3GC ATTTAAAGAT GTTTCTGGCA TTTTCTTTTT 1020 

ATTTGT7UU5G TGGTGGTAAC TATGGTTATT GGCTAGAAAT CCTGAGTTTT C AACTG TATA 1080 

TATCTATAGT TTGTAAAAAG AACAAAAC3UI OOGAGACAAA COCTTGATGC TCCT TG CTOG 1140 

G0GTTGAG6C rSTGOGGAAO ATSCCTTTT6 GGASA06CI6 TAOCTOIOSG OSTS CACr ST 1200 

OAGGCTGGAC CTCTTGACTC TGCAGGGG6C ATCCATTTAG CTTCAGGTTO TCTTGTTTCT 1260 

GTATATAGTG ACATAGCATT CTGCTGCOVT CTTAGCTGTQ GACAAAGOQG GGTCAGCTGG 1320 

CATGAGAATA Trmrrm ' TAAGTGCXSGT AGTTTTTAAA CTGTTTGTTT TTAAACAAAC 1380 

TATAGAACTC TTCATTQTCA GCAAA6CAAA GAQTCACTQC ATCAATGAAA GTTCftAGAAC 1440 

CTCCTGTACT TAAACAOGAT TG8CAACGTT CTGTTATTTT TTTTGTATGT TTASAATSCT 1500 

GAAATGTTTT TGAAGTTAAA TAAACA6TAT TACATTTTTA AAACTCTTCT CTATTATAAC 1560 

AGTCAATTTC TGACTCACAG CAGTGAACAA ACCCCCACTC CATTGTATTT GGAGACTGGC 1620 

CTCCCTATAA ATGTGGTAGC TTCTTTTATT ACTCAGTGQC CAGCTCACTT AGGGCTGftGA 1680 

TGAAGGAGAG GGCTACTTGA AGCTACTGTO TGATTTTGTT TGTGTCTGAG TGGCATTCAG 1740 

ATGAAGTCIG GAGGAGTTAa OAGAAOQACA TAG6CAA6GT TCA6CAGCCT TCCAAQ8TAT 1800 

AGQAA6GT60 GTGATTA66A CTGAGGCTAT CTAGGTTTAA CTTTTGTCCC ACCTCCACCC 1860

CXrrATTrPGT GGGGCCAAAT GCATTGCTAA ACAGCAATTT CASAGTGTAT GGTGTGTCAA 1920 

AAATTAAGGC CTTATTGTTT TTCTCTTTCA CCCCTACCCC C0GTX5CTCCT GGCACATATC 1980 

ACATTATTTG TGGT6CCCAA CATTTGGG6T CTTGA6CCT6 CTGCTGGTCT CCTGGATGCC 2040 

AGTGAGGGTA TOTQGGATGO GOTGGTGGGO TAGGG6A0GG TATGCTTTTT TTGCTCCTAC 2100 

7TGGAAACAC CAAACflCCCC AAGOAAGATG ATAGGCTCCA TCTTGG6CCA CCTOAGCTAT 2160 

AGGGCAGGCT AATGGAATCA ACCATTTCTO AGCACTAAAT GTATCATGAA AAGTTGAATG 2220 

GCCTGCTCAT AAGTTTAGCT CATTCACTGG AAATGTAQAT TGATGTTCAA TGTTAAACTG 2280 

GAAGGAGCTT GGTTT6TGTG TCAGIGGTTA TATTAGT6Q0 TA6TG7AACA T TTTATC CAG 2340 

GTTGGGGTGA OGGGAOATGG CCACAQTA6C AAGTGGTGAC ACTAAATACC ATTTTGAAGG 2400 

CT6ATGT0TA TATACATCAT TACTGTCOGT AGCAATGAAG GATACAGTAC TOTOTTOTGO 2460 

GTGAGTGTTO CTATTGCCCA GCATTAATAT TTGGGTGTGT ATGTTTGAGG CTATGAAACA 2520 

CGCAGGAGTQ TTTTTGTGCT ATTAATTTTA AGAGAAAOCa GCTTTTTCTT AAAATTCACT 2580 

GTTGA6AAAC TTGCATGTCT GGAGGCGGTG TCCTCTCOQC CXTCTOGGGT CCTGGATGAG 2640 

TAOSAGTTAT GGTCA08GTC ACAGOCTQAT CTCTTAT6TQ TTCATAGCCA TT06CTCTCC 2700 

CATCAGAACT 6TTTGT0CTG AATGTGTTOC TCTAGTTCTA 6AAAAT6ACC ACTAATTTAA 2760 

AAAACTOGGT TGTGAGQTTT GCCCAGAGGC ACTTGTTOCA 6AATTTCCCC TCCTGCTTCA 2820 

GCCATGTCCT T6TCACTTGG CATTCTAAGC TAAAGCTTTA GCTTCCCAAT TCGTQATGTG 2880 

CTAGGCCAAS ATT06GQA6C TGTTGCCA6C CTCGTCAAAT ATGGAAGAGA AACAACCT6C 2940 

GGTCSkAAAGG GA6TGATTT6 TTAABrGGTO OGOGTCTATC TCATAACTAG ATGTACCAAC 3000 

CAGGGAAGG6 CCAA6GATGG AAAGGGGIAA CTTTTGTGCT TGCAAAGTAG CTAA6CAGAA 3060 

GTGQGGGAGC AGTTTAOCCA GATGATCPTT GATTAGGCAA ACATTGAGTT TTAAAGAGGC 3120 

TGTGAAGTTG AGGCCACTTG GTCCATTAGC TGGGGCAGCA AGATCACTAC TCAAOGTTTT 3180 

CACACTGTGG CAAGATTGCT CTTCTAGTOG AATAATGCCC TAGTTTCTCT GAGATOATGT 3240 

AA6TGGCATG ATGTTACCTA AGGCTTAGGC TTAGCTTGAT TTCTGG6CCC ACTGTCTGTG 3300 

TTCTTAAGAT GCCAACCTGT TGCTTTTTTT TTTTTTTTOC OCCATTTAAA A6GATAGTAC 3360 

CTACTCCCTC TAACCACXZTC ACCCCATTCT TGAATGACAT TTTATCCTTC G6AAAGAACA 3420 

AG6CTGTGAT GTAGTGACTA TTGTCTGTGT CTCXnxrrGTO TGTCTGrrCT TGTCACAAAT 3480 
GTATTTGGGG AOGTTGC^TG CATTCATTTT CTG7AATAAA G 

Seq ID NO: 306 Protein sequence:
Protein Accession #: NP_005333.1

1 11 21 31 41 51 

I I I I I I 

KAKSDPKKPR GKMSA^AFFV QTCREEHKXK NPEVPVKFA8 FSKXCSESOtK 7KS6XEXSKP 60 

DEKAKADKVR YDREMKDYGP AKG6KKKKDP NAPKRFPSGF FLFCSBPRPX IRS1NFGISZ 120 

GDVAKKLGBf VDHmJSDSEKQ FYZTKAAKLK BinrBRDVADY KSX6KFD6AK GPAKVASKKV 180 
DE 



Seq ID NO: 307 DNA sequence
Nucleic Acid Accession #: NM_022342
Coding sequence: 1..2178






1 11 21 31 41 51 

I I I I I I

ATGGGTACTA GGAAAAAAGT TCATGCATTT GTCOGTGTCA AACCCACOGA TGACTTTGCT 60 

CATGAAA7GA TCA6ATA0GG AGATGACAAA AGAAGCATTG ATATTCACTT AAAAAAAGAC 120 

ATTOSGAGAG GAGTTGTCAA TAAOCAACAG ACASACIGGT OGTTTAASTT GGATGGAGTT 180 

TTCA0GAT6 GCTCCCM36A CTTQGTTTAT GAGACWCTTG CRAAGGATGT GeTTTCTCRG 240 

CCXrrOCSATG GCTATAATGG CACX3VTCATX3 TGTTATC5GGC AGAOGGGAGC TGGCAAGACA 300 

ACACCATGA TGGGGGCAAC TGAQAATTAC AAGCACOGGG GGATCCTCCC TOGTGCCCTG 360 

A6CAGSTTT 7TAGGATGAT 06AAGAA0GC CCGACACATG CCATCACTGT GOGTGTTTCC 420 

ACTTGGAAA TCTATAATGA GAGOCTCTTT GATCfOCTCT CCACTCTGOC CTAT6TT6GA 480 

CCTCAOTCA CACCAATQAC CATCGTGGAA AAOCCTCAAG GAGTCTTCAT TAAGG6CTTG 540 

CAGTTCACC TC3VCAAGTCA GGAGGAGGAT GCATTCAGCC TCCTTTrTOA GOOTG AGAC C 600 

ACAGGATTA TAGCCTCCCA CACTATGAAC AAAAACTCTT CCA6ATCACA CTGCATTTTC 660 

CCATCTACT TA6AGGCCCA TTOCOGGACC TTATCAGAGG AAAACTACAT CACTTCCAAA 720 

TTAACTTGG TGGATCTGGC AGGCTCAGAG A06CTGGG6A AGTCTGGGTC TGAGGGOCAA 780 

TCCTGAA6G AAGOCACCTA CATCAACAAA T06CTCTCAT TCCTGGAGCA G6CCATCATT 840

CCCTTGGGG ACTAGAACCG GGAOCACATC CCCTTTOGGC AGTGCAAGCT CACCCAOGCT 900 

TGAAGGACr CGTTAGGGGG AAACPGCAAT ATGGTCCTC3G TQACAAACAT CTATGGAGAA 960 

CTGCCCAGT TAGAAGAAAC GCTATCTTGA CTGAGATTTG CCAGCAGGAT GAAGCTAGTC 1020 

CCACTGAGC CTGCCATCAA TQAAAAGTAT GAT6CTGAGA GAATGGTCAA GAACCTGGAG 1080 

AOGAACTAG OVCTACTCAA GCAGGAGCTG GCTATCCATG ACAGCXITGAC CAACOOCAOC 1140 

TTGTGACCT ATGACCCCAT GGATGAAATC CAGATTGCTO AGATCAACTC CCAGGTGOGG 1200 

GGTACCTGG AGGGGACACT GGAOGAGATC GACATAATCA GCCTTAGACA GATCAAGGAG 1260 

TGTTCAACC AGTTCCGGGT G6TTCTGAGC CAACAGGAAC AGGAAGTGGA GTCCACTTTG 1320 

GCAGGAAGT ACACXXTCAT TGACAGGAAT GACTTTGCAG CXATTTCTGC TATCX»OAAO 1380 

OQGQGCTTG TGGATGTTGA TGGCXyiCCTA GTGGGTGAGC CTGAAGGACA AAACTTTGGA 1440 

TOQGAGTCG OCCCTTTCTC TACCAAACCT GGGAAGAAAG CCAAGTCCAA GAAGACATTC 1500 

AAGAGCCAC TCAGGCCOGA CACCCXZACCC TCCAAACCAO TGGCXTTTGA GGAGTTTAAG 1560 

ATGAGCAAG GTAGTGAGAT CAACCGAATT TTCAAAGAAA ACAAATCCAT CTTGAAT6AA 1620 

GGAGGAAAA GGGCCAGCQA GACCACACAG CACATCAATG CX^TCAAGOG GGAGATTGAT 1680 

TGACCAAGG AGGCCCT6AA TTTCCAGAA6 TCACTAOGGG AGAAGCAAGG CAAGTAOGAA 1740 

ACAAGGGGC TGATGATCAT 06ATGAGGAA GAATTCCXGC TGATCXTCAA GCTCAAAGAC 1800 

TCAAGAA6C AGTACOGCAG OSAGTACCAG GACCTGCGT G AOCTCAGGGC TGAGATCCAG 1860 

ATTGCCAGC ACCTAGTGGA TCAGTGTCGC CACOGCCXGC TCATQGAATT TGACATCTGG 1920 

ACAATGAGT CXrTTTGTCAT CCCTGAGGAC ATGCAGATGG CACTGAAGCC AGGCGGCAGC 1980 

TCCGGCCAG GCAT6GTCCC TGTGAACSUSG ATTGTGTCTC TGGGAGAAGA TGACCAGGAC 2040 

AATTCAGOC A6CTGCA6CA GAGGQTGCTT CCTGAGGGCC CTGATTCCAT CTCCTTCTAC 2100 

ATGCCAAA6 TCAAGAXAGA GCAGAA6CAT AATTACTTGA AAACGAT6AT GGGCCTCCAG 2160 
AGGCACATA GAAAATAG 



Seq ID NO: 308 Protein sequence:
Protein Accession #: NP_071737



1 11 21 31 41 51 

I I I I I I

MGTRiaWHAP VRVKPTDDFA HEMIRYGDOK RSIDIHLKKD IRKQWKHQQ TDWSPKLD6V 60 

LHDASQDI*VY BTVAKDWSQ ALDGYMOTZM CYGOTGAGICT YTMMGATBnr KHSGZLPRAL 120 

QQVPRMIBBR PTHAITVRVS YIiEIYMBSLP DLLSTLPYVG PSVTPMTIVB NPQOVFIKGL 180 

SVHLTSQEED APSLLFEGET NRIIASHTKM KN8SRSHCIF TIYLEAHSRT LSEERYITSK 240 

INLVDLAGSB RLGKSGSBGQ VLKEATYINK SLSPLEQAII ALGDQKRDHI PFRQCKLTHA 300 

LKDSLGQfCN MVLVTNIYGB AAQLEBTLSS LRFASRMKLV TTEFAZHEKY DAERMVKNLE 360 

XELAIUXQBL AIBDSLTKRT FVTYDFKDEI QIASIHSQVIL RYLSGTIAEI DZZSLRQIKB 420 

VFNQFRWZiS QQEQEVESTL RRKYTLZDSN DFAAZSAZQK AGLVDVDGBL VGEFEQQNFG 480 

LGVAPFSTKP GKKAKSKKTF KEPLRPDTPP SKPVAFEBFK NEQGSBZNRI PKENKSILNE 540 

RRKRASETTO HINAIKREID VTKEAIjNFQK SLREKQGKyE NKGLMIIDEE EFLLII.KIjKD 600 

LKROYRSEYQ DLSDLRAEZQ YCQHLVDQCR HRLLMEFDZN YNESFVZPED MQMALKPGG8 660 

IRPGMVPVKR IVSL6BDDQD KFSQLQQRVL PBQPDSZSFY KAKVKIEQRH NYZiKTimGLQ 720 
QAKRK 



Seq ID NO: 309 DNA sequence

Nucleic Acid Accession #: CAT cluster



1 11 21 31 41 51 

I I I I I I

TrmTrm ttttttttaa tgcct g cigt C3itgctctgt ctaccagggt qaatttccaa 60 

AAATTTCTGC ATAGCAATTT TAGCCAAAAC TATATATGTT CTGGGGAGGA TAGGCATAGG 120 

CACATTGAA6 ACCAAAGGAA AGAGTQAAOA AGTGTAGTTO 66TCATTGTG AATG6AT6TT 180 

TAGAITGTCA AGAAAAGTGG GCCA6A0GCC CCACCTCACA CTAGGAGGGC AATTOCCTCT 240 

CATTAQTATC TCAGGCACCA TGGGTCTTAT TT6GTGTCAT AAGAAACACC CTCAACAAAQ 300 

TAATGAACCC TCAOCCTCCA GCTTCTCTTC TTOGGGATTC TTCTTAGGGC CTCCTTTTTC 360 

CTTTTATGTT TCCAGTACCC TGAATTTCTT ATT0C3CATCC CCCATTAAAA TCTGCTTCAA 420 

AGAAAAAACA AGAAGGACAC ATTCACTTTA AGATOCAAAT 6AATGATAAG A6CTTAAAAC 480 
ATTATACTTA TCAGTATTAT TTOCATTTTT ATAGAAACCA AAAGCATATT TCAACAAC 



Seq ID NO: 310 DNA sequence

Nucleic Acid Accession #: NM_018622.2

Coding sequence: 1-1140 



1 11 21 31 41 51 

I I I I I I

ATQGOSTGGC GAGOCTGGGC GCAGAGAGGC TGGGGCTGOG GCCAG60GTQ GGGTGOSTCG 60 

GTGGGCGGCC GCAGCTGCGA GGA6CTCACT GCGGTCCTAA CCCGGCOGCA 6CTCCT05GA 120 

OGCAGGTTTA ACTTCTTTAT TCAACAAAAA TQOGGATTCA GAAAAGCAOC CAGGAA6GTT 180 

6AACCT05AA GATCAGACCC AGG6ACAAGT GGT6AAGCAT ACAAGAGAAO TGCTTlXiATT 240 

OCrCCTOTQO AAGAAACAGT CTTTTATCCT TCTCCCTATC CTATAAGGAG TCTCATAAAA 300 

CCTTTATTTT TTACVGTTG6 GTTTACAGGC T6TGCATTTG 6ATCAGCT6C TATTTGOCAA 360 




TATGAATCAC TGAAATCCAO GGTCCA6AGT TATTTTQATO GTATAAAAOC IGATTGGTTQ 420 

GATA6CATAA GACCACAAAA AGAA6GAGAC TTCASAAA63 AGATTAACAA GT3QTG6AAT 480 

AACCTAAGTG ATGGGCAGOS GACTGTGACA GGTATTATA6 CTGCAAAT6T CCTTGTATTC 540 

TGTrrATGGA GAGTACCTTC TCTGCAGOGG ACAATGATCA GATATTTCAC ATOGAATCCA 600 

GCCTCAAAGG TCCTTTGTTC TCCAATGTTG CTGTCAACAT TCAGTCACTT CTCCTTATTT 660 

CACAT06CAG CAAATATGTA TOmTG T GU AGCTTCTCTT CCa GCATAQ T GAACAT TCIQ 720 

GGTCAA6A6C AGTTCATGGC AGTBTACCTA TCT6CA0GTG TTATTTCCAA TTTTGTCAGT 780

TACCTGGGTA AAGTTGCCAC AGGAAGATAT GGACCATCAC TTGGTGCATC TGGTGCCATC 840 

ATGACAGTCC TOGCAGCTGT CTGCACTAAG ATCCCAGAAG GGAGGCTTGC CATTATTTTC 900 

CTTCOGATGT TCAO GT T CA C AGCAGGGAAT GCCXTTGAAAG CCATTATOGC GATGGATACA 960 

GCAG6AATGA TCCTGGGATG GAAATTTTTT GATCATGG6G CACATCTTGQ GQGAGCTCTT 1020 

■ m tS G AATAT GGTATGTTAC TTACGGTCAT GAACTGATTT GGAAGAACAO OQAGCOGCTA 1080 
GTGAAAATCT GGCATGAAAT AAOGACTAAT GGCCX3CAAAA AAG6A0G7OQ CTCTAA67AA 



Seq ID NO: 311 Protein sequence:
Protein Accession #: NP_061092.2

1 11 21 31 41 51 

I I I I I I 

HAHRGHAQRO H60GQAWGAS VGGRSCBELT AVLTPPQUiG RRFNFFZQQK 06FRXAFRKV 60 

EPRRSDPGTS GEAYKRSALI PPVEETVPYP SPYPIRSIiZR PLPFTV6FTG GAFGSAAIHQ 120 

YESLKSRVQS YFDGIKAOWL DSIRPQKEGD FRKEINKHWN NLSDGQRTVT 6IIAANVLVP 180 

CLHRVPSLQR TmRYFTSNP ASKVLCSPML LSTPSHFSLP HMAANMYVLW SPSSSIVNIL 240 

GQEQFMAVYL SAGVI5NFVS YLGKVAT6RY CPSLGASGAX MTVLAAVCTK IPB6RLAIIP 300 

LPMFTFTAGN ALKAIIAMDT AGMZbGWKFP SHAABLG6AL FGZWyVTyOH BIiIWKNREPI. 360 
VKIHHEIRTN GPKKGGGSK 

Seq ID NO: 312 DNA sequence
Nucleic Acid Accession #: NM_000625
Coding sequence: 195..3656

1 11 21 31 41 51

I I I I I I

CTCTCGGCCA CCTTTGATCA GQGGACTGGO CAGTTCTAGA CAGTCCCGAA GTTCTCAAGG 60 

CACAGGTCTC TTCCTGGTTT QACTGTCCTT ACCCGGG6GA GGCAGTGCAG CCAGCTGCAA 120 

GCCCXACAGT 6AAGAACATC TGAGCTCAAA TCCAGA7AAG TGACATAAGT GACCTGCTTT 180 

GIAAA6CCAT AGA6ATGGOC TCTCCrTGGA AATTTCTGTT CAAGACCAAA TTCCACCAGT 240 

ATGCAAT6AA TX^GGGAAAAA GGCATCAACA ACAAT6TGGA GAAAGCXXTCC TGTGCCACCT 300 

CCAGTCCAGT GAOICAGGAT GACCTTCAGT ATCACAACCT CAGCAAGCAG CAGAATGAGT 360 

CCC06CA6CC CCT O GTGGAG ACQGGAAAGA AGTCTCCA6A ATCTCTGGTC AAGCT6GATG 420 

CAAOCCCATT GTOCTCCCCA GQGCAT6T6A G6ATCAAAAA CTGGGGCAGC GQGATGACTT 480

TCCAAGACAC ACTICACCAT AAG6CCAAAQ GGATTTTAAC TTGCAGGTOC AAATGTTGCC 540 

TGGGGTCCAT TATGACTCOC AAAA6TTTGA CCAGAGGACC CAGGGACAAO CCTACCCCTC 600 

CAGATGAGCT TCTACCTCAA GCTATCQAAT TTGTCAACCA ATATTACGGC TCCCTCAAAG 660 

AGGCAAAAAT AGAGGAACAT CTGGCCAGGG T6GAAGCGGT AACAAAGGAG ATAGAAACAA 720 

CAOTAACCTA CCAACT6ACG 0GAGAT6AGC TCATCTTOGC CACCAAGCAO 6CCTGG0GCA 780 

AT6CCCCA0G CTGCATTGGG AOGATCCAGT GGTCCAACCT OCAGGTCTTC GATGGCOGCA 840 

GCTGTTCCAC TGCCOGGGAA ATGTTTGAAC ACATCTGCAG ACA0GTQC3GT TACTCCACCA 900 

ACAATGGCAA CATCAGGTCG GCCATCACXXS TGTTCCCCCA QCGGAGTGAT GGCAAGCACQ 960 

ACTTCCGGOT GT6GAATGCT CAGCTCATOC GCTAT6CTGG CTACCAGATG CXaGATGGCA 1020 

GCATCAGAiSO OGAOCCTOCC AAGGIGGAAT TCACTCAGCT GTGCArOGAC CZtSGGCTGGA 1080 

AGCCCAA6TA 0GGCC6CTTC GAT6T6GTCC CCCTGGTCCT 6CAGGCCAAT GGCG6T6ACC 1140 

CTGAGCTCTT OGAAATCCCA CCTGACCTTG TGCTTGAGGT GGCCATGGAA CATCCCAAAT 1200 

ACGAGTGGTT TC3GGGAACTG GAGCTAAAGT GGTACGCCCT 6CCTGCAGTG OCCAACATGC 1260 

TGCTTGAGGT GGGC6GCCTG GASTTCCCAG G6TQCCCCTT CAATGGCT6G TACATGGGCA 1320 

CAGAGATOQG AGTCCGGGAC TTCTGTQATQ TCCAGGOCIA CAACATCCTG GA0GAAGT6G 1380 

GCAGGAGAAT 6GGCCTGGAA A06CACAAGC TGGCCTC6CT CTGGAAAGAC GAOGCTGTCG 1440 

TTQAGATCAA CATTGCTGTG CTCCATAGTT TCCAGAAGCA GAATGT6ACC ATCATGGACC 1500 

ACCACTCGGC TGCAGAATCC TTCATGAAGT ACATGCAGAA TQAATACCGG TCCOGTGGGO 1560 

6CTGCC0G6C ASACTGGATT T6GCTG6TCC CTCCCATGTC TGGGAGCATC ACCCCCGTGT 1620 

TTCACCAGQA QATOCTGAAC TA08TCCTST CCCCTTTCTA CTACTATCAG GTA6AGGCCT 1680 

GGAAAACCCA TGTCTG6CAG GACGAGAAGC GGAGACCCAA GAGAAGAGAG ATTCCATTGA 1740 

AAGTCTTGGT CAAAGCTGTG CTCTTTGCCT GTATGCTGAT GCGCAAGACA ATGG06TCCC 1800 

GAGTCAGAGT CACCATCCTC TTTGCGACAG AGACAGGAAA ATCAGAGGCG CTGGCCTQGG 1860 

ACCTGGGGGC CTTATTCAGC TGT6CCTTCA ACCCCAAGGT TGTCT6CATG GATAAGTACA 1920 

G6CTGAGCTG CCTGQAGGAG GAAOGGCTGC T0TT6QTG6T GACCAGXAOO TTTGOCAATO 1980

GAGACTGCCC TGQCAATGGA GAOAAACTQA AGAAAT06CT CTTCATGCT6 AAA6A6CTCA 2040 

ACAACAAATT CAGGTACGCT GTOTTTGGCC TOCGCTCCAG CATGTACCCT CGGTTCTGCG 2100 

OCTTTGCTCA TGACATTGAT CAGAAGCTGT CCCACCTGGG GGCCTCTCAG CTCACCCCGA 2160 

TGGGA6AAGG G6AT6AGCTC A6TGGGCAGG AGGAOGCCTT COGCAGCTGG 6C0QT6CAAA 2220 

OCITCAAGGC AGOCTGT G AG ACGTTT6ATG TOOQAGGCAA ACAQCACATT CAOATCCCCA 2280 

AGCTCTACAC CTGCAATGTG ACCTGOGACC C6CACCACTA CAGGCTCGTG CAGGACTCAC 2340 

AGCCTTTGGA CCTCAGCAAA OCCCTCAGCA GCATGCATGC CAAGAACGTG TTCACCATGA 2400 

OGCTCAAATC TCGGCAGAAT CTACAAAGTC OGACATCCAG CCGTGCCACC ATCCTGGTGG 2460 

AACTCTCCTG TGAGGATGGC CAAOGCCTGA ACTACCTGCC 6GGGGA0CAC C ri 'U U Jb'' tTr 2520 

GGCCAGGCAA GCAGCOGGOC CTGGTCCAAO GCATCCTGGA GGGAGTGGTG GATG6CCCCA 2580 

CACCCCACCA GGCAGTGCGC CTGGAG6CCC TGGATGAGA6 T66CAGCTAC TG6GTCAGTG 2640 

ACAAGAG6CT GCCCCCCTGC TCACTCAGCC AGGCCCTCAC CTACTTCCTG GACATCACCA 2700 

CACCCCCAAC CCAGCTGCro CTCCAAAAGC TGGCCCAGGT G6CCACAGAA GA6CCT6AGA 2760 

GACAGAG6CT GGAGGCCCTG T6CCAGCCCT CAGAGTACAG CAAGTGGAAG TTCACCAACA 2820

GCCCCACATT CCTOGAGGT6 CTAGAG6AGT TCCOGTCCCT GCGGGTGTCT GCIQGCTTCC 2880 

TGCTTTCOCA GCTCCCCATT CTGAAGCCCA GGTTCTACTC CATCAGCTOC OCOOGGGATC 2940 

ACACGCCCAC GGAGATCCAC CTGACTGTGG CCGTGGTCAC CTACCACAOC C6AGATGGCC 3000 

AGGGTCCCCT GCACCAOGGC GTCTGCAGCA CATGGCTCAA CA6CCTC3UU3 CCCCAAGACC 3060 

CAGTGCCCTG CmVit i CGU AATGCCAGOG GCTTOCACCT CCCCGAGGAT CCCTCCCATC 3120 
CTTGCATGCT CATGQGOCCT GQCACAG6CA TOGOQCCCTT OOGCAG'lTl'C TGGGAGCAAC 3180

G6CTCCATGA CTCCCAGCAC AAGGGAGTGC OGGCSAGGC06 CKTGACCTT6 GTGTTTGGGT 3240 

GCCX3CCGCCC AGftTGAGGAC CACATCTACC AGGAGGAGAT 6CT06AGATG GCCCAGAAG6 3300 

GGGTGCTGCA TGCGGTGCAC AC3VGCCTATT CCCGCCTGCC TXXSCAAGCXX: AAGGTCTATG 3360 

TTCAGGACAT CCT6GG6CAG CAGCTGGCCA GGGAGGTGCT CGGTGTGCTC CACAAGGAGC 3420 

CA6GCCACCT CTATGTTTGC 6GGGATC2TGC GCATGGCC06 QGAGGIOGCC CACACOCTGA 3480 

A6CAGCTG6T G6CTGCCAAG CTGAAATTGA ATGAG6AGCA GGTOGAOGAC TATTTCTTTC 3540 

AGCTCAAGAG CCAGAA60GC TATCACX5AAG ATATCTTTGG TGCTGTATTT CCTTAOGAGG 3600 

CGAAGAAGGA CAGGGIXSGOS GTGCAGCCCA GCAGCCTGGA GAT6TCAG0G CTCT6AGGGC 3660 

CTACAGGAGG GGTTAAAGCr 6GC36GCACAG AACTTAAGGA TGGAGCCASC TCTGCATTAT 3720 

CXGAGOTCAC AGGGCCTGGG 6A6ATGGAGG AAAGTGATAT OOCCCAGCCT CAAGTCTTAT 3780

TTCCTCAACX5 TTGCTOCCCA TCAAGCXXTT TACTTGAOCT CXTAACAAOT A6CAC0CTGS 3840 
ATTGAT06GA GOCTC 

Seq ID NO: 313 Protein sequence:
Protein Accession #: NP_000616

1 11 21 31 41 51

I I I I I I

MACPHKFLFK TKFHQYAMNG EKGIMMNVEX APCATSSPVT QDDLQYHNLS XQQNESPQPL 60 

VETGKKSPES LVKLDATPLS SPRHVRZKim GSGMTFQDTL RRXAXGILTC RSKSCXX3SZM 120 

TPKSLTRGPR DKPTPPDELL PQAIEFVNQY YGSLKEAKIB EHLARVEAVT KEIETTVTYQ 180 

LTCTELIPAT KQAMRNAPRC IGRIQWSNLQ VFDARSCSTA RQIPEHICRH VRYSTNNGNI 240 

RSAITVPPQR SDGKHDPHVW NAQLIRYAGY QMPDGSIRGD PANVEPTQLC IDLGWKPWfG 300 

RFDWPLVLQ ANGRDPELFB IPPDLVIiBVA MEEPKYEWPR ELELKWYALP AVANHLliEVO 360 

GLEPPGCPPM GWYHGTEIGV RDFCDVQRYN ILBEVGRRMG LETHKLASLH KDQAWEINI 420 

AVIBSFQKQN VTIMDHHSAA BSFMKYMQNE YRSRGGCPAD WIWLVPPMSQ SiTPVFHQEM 480 

UlYVIiSPFYY YQVEAHKTHV WQDBKRRPKR REIPLKVLVK AVLFAO^LMR KTMASRVRVT 540 

ILFATET6KS EALAMD1>GAL FSCAPNPKW CKDKYRLSCL EEBRLLLWT STFGNGDCPG 600 

NGEKLKKSLP MLKEUINKFR YAVFGLGSSM YPRPCAFARD IDQKLSHLGA SQLTPKGBGD 660 

ELSGQS)AFR SHAVQTFKAA CETFDVR6KQ HIQIPKLYTS NVTHDPHHYR LVQDSQPIiDL 720 

-5KALS94HAK NVFTHRLKSR QNLQSPTSSR ATILVELSCB DGQGLNYLPG EHIiGVCPGNQ 780 

PALVQ6II£R WDGPTPBQA VRLEALDBSO SYHVSDXRIiP PCSLSQALTY FLDITTPPTQ 840 

LLLQKLAOVA TEEPERQRLE ALCX2PSEYSK NKFTNSPTFL EVLEEFPSLR VSAGFLLSQL 900 

PILKPRFYSI SSPRDHTPTE rHLTVAWTY HTRDGQGPLH HGVCSTWLNS LKPQDPVPCP 960 

VRKASGFHLP SIPSHPCILZ GP6TGIAPFR SPHQQRLHDS QHKGVRGGRM TLVFGCRRPD 1020 

EDHZYQEEML EHAQKGVLBA VRTAYBRLPG KPKVYVQDIb RQQIASEVXiR VI£KEP6HLY 1080 

VG6DVRMARD VARTLKQLVA AKLKLNBEgV EDYFFQLKSQ KRYBEDZFGA VFPYEAKXDR 1140 
VAVQPSSLEM SAL 

Seq ID NO: 314 DNA sequence
Nucleic Acid Accession #: XM_087254
Coding sequence: 47..2332

1 11 21 31 41 51

I I I I I I

A6AGTA0GTG TTTACA6ATA AAACTGGTAC ACTGACAGAA AAT6AGATGC AGTTTOGGGA 60 

ATGTTCAATT AATGGCATGA AATACCAAGA AATTAATGOT AGACTTGTAC COGAAGGACC 120 

AACACCAGAC TCTTCAGAAG GAAACTTATC TTATCTTAGT AGTTTATCCX: ATCTTAACAA 180 

CTTATCCCAT CTTACAACCA GTTCCTCTTT CA6AACX»GT CCTGAAAATG AAACTGAACT 240 

AATTAAA6AA CATGATCTCT TCTTTAAAGC AGTCA6TCTC TGTCACACIG TACAGATTAG 300 

CAATGTTCAA ACTGACT6CA CTGGTGATG6 TCCCTGGCAA TCCAACCTGG CAOCATOGCA 360 

GTTGGAGTAC TATGCATCTT CACCAGATGA AAAGGCTCTA OTAGAAGCTG CTGCAAGGAT 420 

TGGTATTGTG TTTATTGGCA ATPCTGAAGA AACTATGGAO GTTAAAACTC TTGGAAAACT 480 

GGAA0G6TAC AAACT6CTTC ATATTCT6GA ATTTGATTCA GAT0GTAG6A GAATGAGTGT 540 

AATT6TTCAQ GCACCTTCAO GTGAGAAGTT ATTATTTGCT AAAGGAGCTO AGTGATCAAT 600 

TCTCCCTAAA TGTATAG6T6 GAGAAATAGA AAAAACCAGA ATTCAT6TAG ATGAATTT6C 660 

TTTGAAAGGG CTAAGAACTC TGTGTATAGC ATATAGAAAA TTTACATCAA AAGAGTATGA 720 

GGAAATA6AT AAAOSCATAT TTGAAGCCAQ GACT6CCTTG CAGCAGCOGG AAGAGAAATT 780 

G6CAGCTGTT TTCCAGTTCA TAGAQAAAGA CCTGATATTA CTTGGAGOCA CAGCAGTA6A 840 

AGACAGACTA CAAGATAAA6 TTCGAGAAAC TATT(3JtfK» TTGAGAATGG CIGGIATCAA 900 

AGTAT66GTA CTTACTGGGG ATAAACAT6A AACAGCTGIT A6TGT6AGTT TATCATGT66 960 

CCATTTTCAT AGAACCATGA ACATCCTTGA ACTTATAAAC CAGAAATCAG ACAOCGAGTG 1020 

TGCTGAACAA TTGAGGCAGC TTGCCAGAAO AATTACA6A0 GATCATGTGA TTCAGCATGG 1080 

GCTGGTAGT6 GATGGGACCA GCCTATCTCT TGCACTCAGQ GAGCATGAAA AACTATTTAT 1140 

GGAAGTTTGC AGAAATTGTT CA6CI61ATT ATGCIGTOGT ATGGCTCCAC TGCA6AAAGC 1200 

AAAAGTAATA AGACTAATAA AAATATCACC TGAGAAACCT ATAACATTGG CTGTTGGTGA 1260 

TGGTGCTAAT GACGTAAGCA TGATACAAGA AGCCCATGTT GGCATAGGAA TCATGQOTAA 1320 

AGAAGGAAGA CAG6CT6CAA GAAACA6TGA CTATGCAATA GCCAGATTTA AGTTCCTCTC 1380 

CAAATT6CTT TTTGTTCAT6 GTCATTTTTA TTATATTAGA ATAGCTACCC TTGTACAGTA 1440 

TTTTTTTTAT AAGAAIGTGT GCTTTATGAC ACCCCAGTTT TTATATCAGT TCTACT6TTT 1500 

GTTTTCTCAG CAAACATTOT ATGACAGOGT GTACCT6ACT TTATACAATA TTTGTTTTAC 1560 

TTCCCTACCT ATTCTGATAT ATA6TCTTTT GGAACA6CAT 6TAGACCCTC ATGTGTTACA 1620 

AAATAAGCCC ACCCTTTATC GAGACATTAG TAAAAACC3QC CTCTTAAGrTA TTAAAACATT 1680 

TCTTTATTGC ACCATCCT6G GCTTCAGTCA TGCCTTTArP TTCTTTmG GATCCT ATTT 1740 

ACTAATAGGG AAAGATACAT CTCTGCTTGG AAATG60OU3 ATGTTT86AA ACTOGACATT 1800 

TGGCACmG GTCFTCACAG TCATGGTTAT TACAGTCAQV GTAAAGATG6 CTCTGGAAAC 1860 

TCATTTTTGQ ACTTGGATCA ACCATCTOGT TACCTGGGGA TCTATTATAT TTTATTTTGT 1920 

ATTTTCCTTG TTTTATGGAG GGATTCTCT6 GCCATTTTTG GGCTCCCAGA ATATGTATTT 1980 

TGTGTTTATT CAGCTCCTGT CAAGTGGTTC TGCTTGGTTT OCCATAATOC TCATQ0TTG7 2040 

TACATGTCTA TTTCTTGATA TCATAAAGAA GGTCTTT6AC G6ACA0CTGC ACCCTACAAO 2100 

TACTGAAAAG GCACAGC7TA CTGAAACAAA TGO^GGTATC AA6TGCTTG6 ACTCCATGT6 2160 

c m rnixu G gaaggaqaag cagcgtgtgc atctgttgga agaatgctgg aaogagttat 2220 

AQGAAGATGT AGTCCAAOCC ACATCAGCAG ATCATGQAGT GCATCGGATC CTTTCTATAC 2280 

CAACGACAGG AGCATCTT6A CTCTCTCCAC AATGGACTCA TCXACrTGTT AAAG6QQCAQ 2340 
TAGTACTTTG TGGGAGCCAG TTCACCTCCT TTCCTAAAAT TCAGTGTGAT CACCCTGTTA 2400 

ATGGCCACAC TAGCTCTGAA ATTAATTTCC AAAATCTTTG TAGTAGTTCA TACCCACTCA 2460 

GAGTTATAAT GGCAAACAAA CA6AAA6CAT TAGTACAACC CCCTCCCAAC ACOCTTAATT 2520 

TGAATCTGAA C31TGTTAAAA TTTGAGAATA AACAGACATT TTTCATCTCT TTGTCTGGTT 2580 

T QIX XXTTCT OCTTATGGGA CTCCT A ATGG CATTTCAGTC TGTTG CTGAG GOCATTATAT 2640 

TTTAATATAA ATGTAGAAAA AAGAGAGAAA TCTTAGTAAA GAGTATTTTT TAGTATTAGC 2700 

TTGATTATTG ACTCTTCTAT TTAAATCTGC TTCTGTAAAT TATGCTGAAA GTTTGCCTTG 2760 

AGAACTCTAT rrTTTTATTA GAGTTATATT TAAAGCTTTT CATGGGAAAA GTTAATGTGA 2820 

ATACTGAGGA ATTTTGGTCC CTCAGTGACC TGTGTTGTTA ATTCATTAAT GCATTCTGAS 2880 

TTCACAGM3C AAATTAOOAG AATCATTTCX AACCATTATT TACTSCAGTA TGQQGAGTAA 2940 

ATTTATAOCA ATTCCTCTAA CTGTACTGTA ACACaVGCCTG tAAAGTTAGC CATATAAATO 3000 

CAAGGGTATA TCATATATAC AAATCAGGAA TCAOGTODGT TCACCGAACT TCAAATTGAT 3060 

GTTTACTAAT ATTTTTGTGA CAGAGTATAA AGACXXTTATA GTGGGT AAAT TAGATACTAT 3120 

TAGCATATTA TTAATTTAAT GTCTTrATCA TTGGATCTTT TGCATGCTTT AATCTGGTTA 3180 

ACATATTTAA ATTTGCTTTT TTTCTCTTTA CCTGAAGGCT CTGTGTATAG TATTTCATGA 3240 

CATaSTTGTA CAGTTTAACT ATATCAATAA AAAGTTTOGA CAGTATTTAA ATATTGCAAA 3300 

TAT6T1TOAT TATACAAATC AGAATAGTAT OGGTAATTAA ATGAATACAA AAAGAAGAGC 3360 

CTCTT T CT G C AGCCGACTTA GACATGCTCT TOCCTTTCTA TAAGCTAGAT TTTAGAATAA 3420 

AGGGTTTCAG TTAATAATCT TATTTTCAGG TTATX3TCATC TAACTTATAG CAAACTACCA 3480 

CAATACAGTG AGTTCTGCCA GTGTCCCAGT ACAAGGCATA TTTCAGGTGT GGCTGTGGAA 3540 

TGTAAAAAT6 CTCAACTTGT ATCAGGTAAT GTTAGCAATA AAXTAAAT6C TAAGAA TGAT 3600 

TAATOSGGTA CATGTTACTO TAATTAACTC ATTGCACTTC AAAAOCTAAC TTCCATCCTO 3660 

AATTTATCAA GTAGTTCAGT ATTGTCATTT GTnTTGTTT TATTGAAAAG TAATGTTGTC 3720 

TTAAGATTTA GAAGTGATTA TTAGCTTGAG AACTATTACC CAGCTCTAAO CAAATAATGA 3780 

TTGTATACAT ATTAAGATAA TGGTTAAATG COQTTTTACC AAGTTTTCCC TTGAAAATGT 3840 

AATTCCTTTA TGGAGATTTA TTGIGCAGCC CTAAGCTTCC TTCCCATTTC ATG AATA TAA 3900 

GGCTTCTAGA ATTGGACTGG CAOGOGAAAG AATGGTAGAG ACAGAAATTA AGACTTTATC 3960 

CTTGTTTOCT TGTAAACTAT TATTTTCTTG CTAATGTAAC ATTPGTCTGT TCCA GTGATQ 4020 

TAAGGATATT AAGTTATTAA GCTAAATATT AATTTTCAAA AATAGTOCTT CTTTAACTTA 4080 

GATATTTCAT AGCTGGATTT AGGAAGATCT GTTATTCTQO AAGTACTAAA AAGAATAATA 4140 

CAAOGTAGAA TGTCT6CATT CACTAATTCA TOTTOCAGAA GAG6AAATAA TGAAGATATA 4200 

CTC3U3TAGA6 TACTAGGTG6 GAGGATAT6G AAATTTGCTC ATAAAATCTC TT ATAA AAOG 4260 

TGCATATAAC AAAATGACAC CCAGTAGGCC TGCATTACAT TTACATGACC GTGTTTATTT 4320 

GCCATCAAAT AAACTGAGTA CTGACACCAG AC3VAAGACTC CAAAGTCATA AAATAGCCTA 4380 

TGACCAACTG CAGCAAGACA GGAGGTCAGC TOGCCTATAA TGG TGCTT AA AGTGTGATTG 4440 

ATGTAATTTT CT6TACTCAC CATTTGAAOT TA0TTAAGGA GAACTTTATT TTTTTAAAAA 4500 

AAGTAAATGG CAACCACTAG TGTGCTCATC CTGAACTCTT ACTCCAAATC CACTCOGTTT 4560 

TTAAAGCAAA ATTATCTTGT GATTTTAAGA AAAGAGTTTT CTATTTATTT AAGAAAGTAA 4620 

CAATGCAGTC TGCAAGCTTT CAGTAGTTTT CTAGTGCTAT ATTCATCCTQ TAAAACTCTT 4680 

ACTACOTAAC CAGTAATCAC AAGGAAAGTO TCCCCTTTGC ATATTTCTTT AAAATTCTTT 4740 

CTTTGGAAAG TATGATGTTG ATAATTAACT TACCCTTATC TGOCAAAAGC AGAGCAAAAT 4800 

GCTAAATAOQ TTATTOCTAA TCAGTGGTCT CAAATCGATT TGCCTCCCTT TGCCTOGTCT 4860 

GAGGGCTGTA AGCCTGAAGA TAGTGGCAAG CACX^AGTCA GTTTCCAAAA TTGCCCCTCA 4920 

QCTGCrrTAA GTGACTCAGC ACCCTGCCTC AGCTTCAGCA GGCGTAGGCT CACCCTGGGC 4980 

GGAGCAAAGT ATG6GCCA66 GA6AACTACA GCTAOGAAGA CCTGCIGTOG AGTTGAGAAA 5040 

AGGGGAOAAT TTATGGTCTO AATTTTCTAA CTOTOC r Crr TCTTQGOTCT AAA6CTCATA 5100 

ATACACAAAG GCTTCCAGAC CT6AGCCACA CCCAGGCCCT ATCC7GAACA G6AGACTAAA 5160 

CAGAGGCAAA TCAACCCTAG QAAATACTTG CATTCTGCCC TAOGGTTAGT ACCAGGACTG 5220 

AGGTCATTTC TACTGGAAAA GATTGTGAGA TTGAACTTAT CTGATCX3CTT GAGACTCCTA 5280 

ATAG6CAGGA GTCAAGGCCA CTAGAAAATT GACAGTTAAG AGCCAAAAGT TTTTAAAATA 5340 

TGCTACTCT6 AAAAATCIOS TGAAGGCTGT AGGAAAAGGO AOAATCTTOC ATGTTG8TOT 5400 

TTTTCCTGTA AAGATCAGTT TGG6GTA7GA TATAAGCAOO TATTAATAAA A ATAAC ACAC 5460 

CAAAGAGTTA CGTAAAACAT GTTTTATTAA TrTTGGTCCC OVCXSTACAGA CATTTTATTT 5520 

CTATTTTGAA ATGAGTTATC TATTTTCATA AAAGTAAAAC ACTATTAAAG TGCTGTTTTA 5580 

TOTGAAATAA CTT6AATGTT GTTCCTATAA AAAATAGATC ATAACTCATG ATATOTTTOT 5640 

AATCAT06TA ATTTAGA7TT T7ATGAGGAA TGAGTATCTO QAAATATTGT AGCAATACTT 5700 

GGTTTAAAAT TTTGGACCTG AGACACTGIG GCTGTCTAAT GTAATCCTTT AAAAATTCIC 5760 

ItSCATTGrCA GTAAATGTA6 TAZATTATTG TACAGCTACT CATAATTTTT TAAAGTTTAT 5820 
GAAGTTATAT TTATCAAATA AAAACTTTCC TATAT 

Seq ID NO: 315 Protein sequence: 
Protein Accession #: XP_087254 



1 11 21 31 41 51 

| | | | | | 

KQFRECSING MKYQEIKGRL VPE6PTPDSS BSHLSYLSSL SHTiWHLSHLT TSSSFRTSPE 60 

MBTELIKEHD LFFKAVSLCH TVQISNVQTD CTGDGPWQSM lAPSQLEYYA SSPDBKALVE 120 

AAARIGTVFI GHSEETMBVK TLGKLERYKL LHILEFDSDR RRMSVIVQAP SGEKLLFAKQ 180 

AESSILPKCI GGEIEKTRIH VDEFALKGI*R TIXTIAYRKPT SKEYEEIDKR IFEARTALQQ 240 

REEKLAAVPQ PIEKDLH.LG ATAVEDRLQD KVRETIEALR MAGIKVWVLT GDKHBTAVSV 300 

SLSCGHFBRT MtflLELINQK SDSBCAEQLR QLARRITEDfi VIQHGLWDO TSLSLALRSI 360 

EKLFKEVCRH CSAVLCCRMA PLQKAKVIRL IKISPERPIT LAVGDGANDV SMIQEAKVGI 420 

GIMGKBGRQA ARNSDYAIAR FKFLSKI«LPV HQHFYYIRIA TLVQYPFYKN VCFITPQFLV 480 

QFYCLPSQQT LYDSVYLTLY NICFTSLPIL IYSU»EQHVD PHVI^NKPTL YRDISKURLL 540 

SIOTPLYWTI LGPSHAPIFP FGSYLLIGKD TSLUaJGQMP (2JWTFGTLVP TVMVITVTVK 600 

MALETBFNTW INHLVTHGSI ZFYFVFSLFY GGZLWPFLGS QNHYFVFIQL l.5SGSAIfFAX 660 

IU4WTCLFL DIXKKVFDRH LHFTSTEKAQ I.TBT1IAGIKC IiDSMCCFPBG BAAC3^SVGRK 720 
liBRVIGRCSP TEISRSWSAS OPFRNDRSZ LTLS1MDSST C 

Seq ID NO: 316 DNA sequence 
Nucleic Acid Accession #: NM_004473 
Coding sequence: 661..1791 

1 11 21 31 41 51 

| | | | | | 

CTOGCCAOOG GTC0G06GGG CTGGAGACCC AOSCCGTGGA GAGGACXaCC CTCAGGTOGC 60 
OOCGOCTGGG CC060GCCCC GftOCTCGCTG COCCOBGCTC GGCTCTCT6C CC38TG6C6CT 120 

TACOGCCACC TTGGCXrrOGG G6GCAGGGCA TGGGGGGCXX: COSOCSUSATC GCCCAGOGCC 180 

AGTACTAACT GCCCTCGCTC TGGCXnTCXSA GCC03AAGCC TCTTCIXSOSC GCRCAACCTA 240 

GGCW5TAATC CTAAACTAGC GGGCACCACA GACCAGCTGC AGCCACCCCA ACCCAGGGAT 300 

CACTTCOGGA CTCCTOGACC 6CC0GGCACC AGCGOGCAAG GGAOCXTTTCA GC06GAGACC 360 

AGAGTCCAGT CC0GGT06GG PQGCCACOGC CGCTGCCX^GC CTCGAGAAGC ACAAOQOSGG 420 

CTGAGCX3GT C GGCTAGOGGG TCACTCCCGA GCCTCPGTCT GCAOOGOXX: AGCCCCAGAC 480 

CAOCGACGCT GAGOCTCCAG CGCGOGCCAG CCTGGGCOGC TGGGCTCTCC GGGCCAGCXrC 540 

GC3GAa3ATCC CCTGAGCTCT CCGCAGAAGG GC0GAGCX3TC CGTTCCGGGG ACGCCAGGCC 600 

CGCCCCOGCC CCCOGACAGC GGOGGGGATC CAGAGCC0G6 GGGT SO GG G A OGCCOGOGCC 660 

ATGACTGCOG AGAG00G6CC 6GG6GG6C06 CAGCOGGAGG T6CTGGCTAC 0GT6AAGGAA 720 

GAGCGOGGCG AGAOGGCAGC AGG0GCC6G6 GTCCCAGGGG AGGOCACGGG COGCGG6G0G 780 

GG06G60GGC 6CC3GCAAGCG CCCCCTGCAG OGOGGGAAGC 06CCCTACAO CTACATOGCG 840 

CrCATOGCCA TGGCCATCGC GCACGCGCCC GAOCGCCGCC TCAOGCTGGG CGGCATCTAC 900 

AAGTTCATCA CCGAGC6CTT CCCCTTCTAC OSOGACAACC 0CAAAAAGT6 6CAGAACAGC 960 

ATCOGCCACA ACCTCACACT CAA0GACT6C TTCCTCAAGA TOOOO OQOGA G6CCGGG06C 1020 

CCX3GGTAA6G GCAACTACTG GGOGCTCGAC CCCAACGOGG AG6ACATGTT CGAGAG0G6C 1080 

AGCTTCCTGC GCCGCCGCAA GOGCTTCAAG 0GCT06GACC TCTCCACCTA CCCGGCTTAC 1140 

ATX3CACGACG CGGCGGCTGC CGCAGCCGCC 6CT6C0GCAG CCX3C0GC0GC CGCOGCCGCC 1200 

GCOGCCATCT TCCX3M3GOGC QGTGCCOGCC GOGCGCCCCC CCTACCOGGG CGCCGTCTAT 1260 

GCAGGCTACG CGCOGCOGTC GCTGGCOGOG COGCCTCCAG TCTACTACCC 0GOGGC3BTOG 1320 

CCOGGCXXTTT GC0GCX3TCTT CGGOCTGGTT CCTGAGCGGC OGCTCAGCCC AGAOCTGGOG 1380 

OOOGCAGCGT GQGQGCCaSG 09GCTCTTGC GCCTTTGCCT COGCOGGOGC CCCOGCTACC 1440 

ACCACC06CT ACCAGCCCOC A0GCT6CACC GGGGCCOSGC OGGCCAACCC CTCTGCCTAT 1500 

GC6GCTGCCT ACGOGGGCCC OGACGGGGOS TACCOGCAGG GOGCCGGCAG TGC3GATCTTT 1560 

GCXX5CTGCTG GCOGOCTGGC GGGACCCGCT TOGCCCCCAO CX3GGCGGCAG CAGTGGCGGC 1620 

GTGGAGACCA OGGTOGACTT CTACX3GG0GC AOGTOCCCOG GCCAGTTCGG AGCGCTGGGA 1680 

QQCT6CTACA ACOCtGGOGQ GCMSCTOGGA G6GGCCAGTG CAGGCX3CCTA CXIATGCTCGC 1740 

CATGCTCCOG CTTATCCOGG TQGGATAGAT OGGTTOGTGT CGGCCATGTG AGCCAGCGTA 1800 

GGGACGAAAA CTCATAGACA CATOGGCTGT TCACACXTTTC CCOGCAACCT GAGAAOGAAC 1860 

AGGAATGGAG AGAGGACTCA ACTGGGACXX: AOGTGGAAAA GACOGAGCAG GCCACAGAGG 1920 

CTCX5GTCTCC COOOGCACAG CX5TAGGCACC CTGTGTACTC TGTAAAOGGG AGGA06TGGG 1980 

GGQAGGCM3C CAOAOGCCTT OGACTGSCAC AOQGAOOCTC GATQ6AG0GA AOOOCTCAAA 2040 

OGGGATGCTT TCT06CATTC TATOGGGGAG GGTCCTT GG C OGTAACCAGA GGGCAGOGTA 2100 

GTGTCAACAC CAGAGACCAG GATCCAAATT GTGGGGAATC AGTTTCAGOC TTCCATGTGC 2160 

TGCCGGAACT CGGGCCTTTT TAOGCGGTTC GTCCTCTAGT GCCTTTAACT GCGTTACTAC 2220 

AATAAAAG6C TGOGGCAGCG GCTTTCTTCT TAAAGTGAGG AGGACAAATT TGCAAAAGAA 2280 

ATAGGCTTTT C T lCmTrr AAATTOGAOA AATCTCTGCT CTGOTTGACC TQOQCSQaTS 2340 

TTCCCTGTCT CTGAQAACTT GAGACCTAGC TCOGAGTTQA ACTGTG06TC AGCACTCCAG 2400 

TCCCATCACC TGAACCTTCA GTCTCCCCCA TCTGTTACAC TAGAGGGCTG CAGOACTCTA 2460 

TCCACOSCCC COGGGTTATC ATTCA6GGCC CCATCATCTT GOATGCTGCC CTGC3GTATTT 2520 

GGCAGCAATG GTGGGCCACC CAGGGCXnXTr GAGTAGCCAC CCAAAGCCTA GCC3GCXGT7C 2580 

TAGGGAAC66 AAAAGAGTTC ATGGCCAAGC GTCTTUlOCTA AAGTCCCAGO AT TGGCT CCA 2640 

QGCAGCAATT ATATCATAAC TTATTGAACT TTTGAGCAGG AOGTGCTGGT AATTTCATGG 2700 

CTCTTACTGC OCAGTCATAA ATCTGCTTTT CCATTATAAG GCAGAGAGAA GTACATTCGT 2760 

TCATTTGTCC ACTGTTTCTT GTCATCAC3GC AGCCXTTGOAC CCAAAGQGTO AACTAAAGTT 2820 

TAAGGAGATG A6AGGATTCA AGGAGCCCGT TOQTGAOGCC TTTCAGTAGC TGGGGAGGGC 2880 

TCTTCCAICC CCAGCACCCC CTGCTACACC TCAGCAGOCT GCCCCATGCA AAA AGQA AAQ 2940 

A6AAAAATTA AGTTAGGGCA GTCAGTAAAG TGAGCTTTAO AAAGAAACTG 6AATTTTAAC 3000 

TTCATTTTGT ATCTTGCTTA AGTAGCAGGC TCACTAAAAT TAGAGAAAGT CCAATAACTC 3060 

TCCCC C TTT C CCTTGAGAAA TCTTTAAGTT TCGATTCTGO AGCAAAAACT TTCAGCATTA 3120 

AATATTTCAG AGGCTCCATT CACAGCTTTC AGATAAACTG GAGTGTTCAC ATGGACTGTT 3180 

TTAATAAAAA TCTTTQAOGA AGTGAGTTA7 GGCAAGAGAA ACTCASOCTC TTTCTGTATA 3240 

AACTTAACAO GGAAG6GCTG 6G6T6TGAAA AAGAA6ATTG TATGAAAAOC A TTGG TAATT 3300 

TTTATTTTTT ATTTTTGQGA CTGCACTATC CTGTTCAaSA AGACATGTGA ACTTG6TTCA 3360 

GTOaAATGG GGATTTGTAT AAACCAGTGC TCTCCATTAG AAATATQGTG CAAGCCACAT 3420 

ATGTAATTTT AAAIATTCTA GTAGCCACAT TAATAAAGTN AAAAQAAACA AAAAAAAAAA 3480 
AA 

Seq ID NO: 317 Protein sequence: 
Protein Accession #: NP_004464 

1 11 21 31 41 51 

| | | | | | 

FKHLTHYROI DTRANSCRIP TIONFACTOR TTFMTAESGP PPPQPBVLAT VKESRGETAA 60 

GAGVPGEATG RGAGGRRRKR PLQRGKPPYS YIALIAMAIA HAPERHLTLG GIYKPITERP 120 

PFYRDKPKKH QNSIRBNLTL NDCFLKIPRE AGRPGRGKYH ALDPNAEDMF SSQSPXiRRKK 180 

RPRRSDLSTY PAYMHZ2AAAA AAAAAAAAAA AAAAAIFPGA VPAARPSVPG AVICAGYAPPS 240 

LAAPPPVYYP AASPGPCRVF 6LVPERPLSP BIiGPAPSGFQ GSCAPASAGA PAT TTGlfQ PA 300 

GCTGABFANP SAYAAAYAOP DGAYPQGAOS AZFAAAGRIA QPASPPA638 8CSGVETTVDP 360 
YGRTSPGQPO ALGACyHPOG QUSaASAOAY HAREAAAYP6 GZDRPVSAN 

Seq ID NO: 318 DNA sequence 
Nucleic Acid Accession #: NM_005688 
Coding sequence: 126..4439 

1 11 21 31 41 51 

| | | | | | 

CC33GGCAGGT GGCTCAT6CT 0GGGAQCX3TG GTTQAGCGGC TGGCGOGGTT GTCCTGGAGC 60 

AGGGGCGCAG GAATTCTGAT GTGAAACTAA CAGTCTGTGA GCCCTGGAAC CTCCGCTCAG 120 

AGAAGATGAA GGATATOGAC ATAGOVAAAG AGTATATCAT CCCCAGTOCT GGGTATASAA 180 

GT6TGAG6GA GAGAACCAGC ACTTCTGGGA OGCACAGAGA COGTGAAGAT TCCAA0TTCA 240 

G6AGAACT0G ACCGTTOGAA T6CCAAGATG CCTTGGAAAC AGCAGCC06A 6CC6AG66CC 300 

TCTCTCTTGA T60CICCAT6 CATTCTCAQC TCAGAATCCT GGATXSAOGAG CATCCCAAGG 360 

GAAAGTACCA T C A T UO Cn'G A6TGCTCTGA AGOCCATCOG GAC7ACTTCC AAACAOCAGC 420 

ACXX»GT6GA CAATGCTQ66 Cm ' Ti ' ICC T GTATGACTTT TTOGTGGCTT TCTTCTCTGO 480 
COOGTUYGUC CCACAAGAA6 GGGGAGCTCT CMrGGAAGA OGTGTO G TCT CTGTCCMGC 540 

AGGAGTCTTC TGAGGTGAAC TX3CAGAAGAC TASA6AGACT 6T06GAA6AA 6A6CTGAAT6 600 

AAGITCQGCC AGACGCTGCT TCCCTGOGAA GGGTTGTGTG GATCTTCTGC OBCACCAGGC 660 

TCATCCTGTC CATCGTGTGC CTGATGATCA OGCAGCTCGC TCGCTTCAGT GGACCftGCCT 720 

TCATGGTGAJV ACACCTCTTG GAGTATACCC AGGOUUAGA GTCTAACCTG aVOTACAGCT 780 

TSTTGTTAGT 6CT6GGCCTC CTOCIGAOGG AAA.TCX3T60G GTCTEGGTOG CTTGCACTGA 840 

CTTGGGCATT GftATTACCGA ACOGGTGTCC GCTTGC6GG& 6GCCATCCTA ACCATGGCAT 900 

TTAAQAA6AT CCTTAAGTTA AAGAACATTA AAGAGAAATC CCTGGGTGAG CTCATCAACA 960 

TTTGCrCCAA CGATGGGCAG AGAATSTTTC AGGCAGC3«3C OSTTGGCAGC CTGCTGGCrG 1020 

GAGGAOCOGT TGTTGCCATC TTAGGCATGA TTTATAA7GT AATTATTCTG GGACCAACAC 1080 

GCTTCCTGG6 ATCAGCIGTT TTTATOCTCT TTTACCCASC AAT6AT6TTT GCATGAC66C 1140 

TCACAGCATA TTTCAC3GAGA AAAT60GTGG COGCCACaSGA TGAAGGTGTC CAGAAGATGA 1200 

ATGAAGTTCT TACTTACATT AAATTTATCA AAATGTATGC CTGGGTCAAA GCATTTTCTC 1260 

AGAGTGTTCA AAAAATC08C GAGGAGGAGC GTOSGATATT G6AAAAAGCC GGGTACTTCC 1320 

A6GGTATCAC TGTGGOTGTG GCTCCCATTG TGGTGGTGAT T6CCA606T6 6TGACCTTCT 1380 

CT6TTCATAT GACCCTGGGC TT0GATCT6A CAGCAGCACA GGCTTTCACA GTGGT6ACAG 1440 

TCTTCAATTC CATGACTTTT GCTTTGAAAG TAACACCGTT TTXaGTAAAG TCOCTCTCAG 1500 

AAGCCTCAGT GGCTGTTGAC AGATTTAAGA GTTTGTTTCT AATGGAAGAG GTTCACATGA 1560 

TAAAGAACAA ACCAGCCAGT CXnXSVCATCA AGATAGAGAT GAAAAATGCC ACCTTGGCAT 1620 

GGGACTCCTC CCACTCCAGT ATCCAGAACT CGCCCAAGCT GACCCOCAAA ATGAAAAAAG 1680 

ACAAGAGGGC TTCCAGGGGC AAGAAAGAGA AGGTGAGGCA GCTGCAGCGC ACTGAGCATC 1740 

AGGCGGTGCT GGCAGAGCAG AAAGGCCACC TCCTCCTOGA CAGTGAOGAG CGGCCCAGTC 1800 

COGAAGAGGA AGAAGGCAAG CACATCCACC TGGGCCACCT GOSCTTACAG AGGACACTGC 1860 

ACAGCATCGA TCTGGAGATC CAAGAGGGTA AACTGGTTGG AATCT6CGGC AGTGTGGGAA 1920 

GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAGAT GACQCTTCTA GAQGGCAGCA 1980 

TTGCAATCAG TGGAACCTTC GCTTATGTGG CCCAGCAGGC CTGGATCCTC AATGCTACTC 2040 

TGAGAGACAA CATCCTGTTT GGGAAGGAAT ATGATYSAAGA AA6ATACAAC TCTGTGCTGA 2100 

ACAGCTGCTG CCTGAGGCCT GACXTTGCCCA TTCTTCCCA6 CAGOSACCTG ACGGAGATTG 2160 

GAGAGCGAGG AGCCAACCTG AGOG6TGGGC AGCGCCAGAG GATCAGCCTT GCC0G6GGCT 2220 

TGTATAG76A CAGGAGCATC TACATCCTGG ACQAOCXXCT CASTGCCTTA GATGCCCAT6 2280 

TGGGCAACCA CATCTTCAAT AGTGCTATCC GGAAACATCT CAAGTCCAA6 ACAGTTCTGT 2340 

TTGTTACCCA CCAGTTACAG TACCTGGTTG ACTGTGATGA AGTGATCTTC ATGAAAGAGG 2400 

GCTGTAT7AC GGAAAGAGGC ACCCATYSAGG AACTGATGAA TTTAAATGGT GACTAtTGCTA 2460 

CCATTTTTAA TAACCTGTTQ CTGG6AGAGA CACCGCCAGT TGAGATCAAT TC»AAAAAOO 2520 

AAACCAGTGG TTCACAGAAG AAGTCACAAO ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 2580 

AGGAAAAAGC AGTAAAGCCA GAGGAAGGGC AGCTTGTGCA GCTGGAAGAG AAA6GGCAGG 2640 

GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACATCCAGGC TGCTGGGQGC CCCTTGGCAT 2700 

TCCZX^GTTAT TMOGCCCTT TTCAT6CTGA ATGTASGCAG CACCSOCTTC A6CACCT0GT 2760 

GGTTGAGTTA CTGGATCAAG CAAGGAAGOG G6AACA0CAC TGTGACTC6A G6GAAC6A6A 2820 

CCTOGGTGAO TGACAGCATG AA6GACAATC CTCATATGCA GTACTATGCC AGCATCTACG 2880 

CCCTCTCCAT GGCAGTCATQ CTGATCCTGA AAGCCATTOS AGGAGTTGTC TTTGTCAAGQ 2940 

GCACGCTGCQ AGCTTCCTCC CGGCTGCATG AOGAOCTTTT CCX3AAGGATC CTTOGRAGCC 3000 

CrATGAAGTT TTTTGACSiaS ACOCCCACAO 0GAG6ATTCT CAACAGGrTT TCCAAAGACA 3060 

TGGATGAAGT TGAOGTGOQG CTGCCGTTOC AGGCOGAGAT GTTCaTCCAO AACGTTATCC 3120 

TGGTGTTCTT CTGTGTGGGA ATGATCGCAG GAGTCTTCCC GTGGTTCCTT GTGGCAGTGG 3180 

GGOXCTTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTCGGGAGC 3240 

TGAAGCGTCT GGACAATATC ACGCAGTCAC CTTTCCTCTC CCACATCACG TCCAGCATAC 3300 

AGGGOCTTGC CACCATCCAC OCCIACAATA AAGGGCAGGA GmCTGCAC AGATACCAGG 3360 

AGCTGCTGGA TGACAACCAA GCTCCTTTTT TT7TGTTTAC GTGT6GGATG 0GQTG6CTGG 3420 

CTGTGCX3GCT GGACCTCATC AGCATOGCCX: TCATC31CCAC CAOGGGGCTG ATGATC3GTTC 3480 

TTATGCAGGG GCAGATTCCC CCAGCCTATG CGGGTCTOGC CATCTCTTAT GCTGTCCAGT 3540 

TAACGGGGCT GTTCCAGTTT AOGGTCAGAC TGGCATCTGA GACAQAA6CT CGATTCACXTT 3600 

CGGTQ6AGA0 GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAOCACCT GCCAGAATTA 3660 

AGAACAAGGC TCCCTCCCCT GACTGGCCOC AGGAGG6A6A GGTGACCTTT GAGAAOGCAG 3720 

AGATGAGGTA CCGAGAAAAC CTCCCTCTTG TCCTAAAGAA AGTATCCTTC ACGATCAAAC 3780 

CTAAAGAGAA GATTGGCATT GTGGGGCGGA CAGGATCAGG GAA6TCCTG6 CTGGGGATGG 3840 

CCCTCTTCOG TCTGOTGGAG TTATCTGGAG GCTGCATCAA GATTQATGGA GTGAGAATCA 3900 

GTGATATTGQ CCTTOOOGAC CTCOQAAGCA AACTCTCTAT CATTCCTCAA QAOCOGGTOC 3960 

TGTTCAGTGG CACTGTCAQA TCAAATTTGO ACCCCTTGAA CCAGTACACT 6AAGACCAGA 4020 

TTTGGGATGC CCTGGAGAGG ACACACATGA AAGAATGTAT TGCTCAGCTA CXHCTGAAAC 4080 

TTGAATCTGA AGTGATGGAG AATGGGGATA ACTTCTCAGT GGGGGAAOGG CAGOTCTTGT 4140 

6CATAGCTAG AGCCCTGCTC OGCCACTGTA AGATTCTGAT TrTAGATGAA GCCACAGCTG 4200 

CCATGQACAC AGAGACAOAC TTATTQATTC AAGAGACCAT COBAOAAOCA TTTGCAGACT 4260 

GTACCATOCT QACCATTOCC CATOGCCTGC ACACOGTTCT AGQCTCCGAT AGGATTATG6 4320 

TGCTGGCCCA GGGACAGGTG GTGGAGTTTG ACACCCCATC GGTOCTTCTG TCCAAOGACA 4380 

GTTCCCGATT CTATGCCATG TTTGCTGCTG CAGAGAACAA GGTCGCTGTC AAGGGCTGAC 4440 

TC CTCCC TGT TGACGAA6TC TCTTTTCTTT AGAGGATTGC CATTCCCTGC CTGGGGOGGG 4500 

CCCCTCATOG CGTCCTCCTA COQ A AAOCyx GOCTTTCTCG ATTTTATCTT TOQCACAGCA 4560 

GTTC06GATT GGCTTGTGTG TTTCACTTTT AGGGAGAGtC ATATTTTGAT TATTGTATTT 4620 

ATTCCATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

GGGAACCGTT ATTATAATTG TATCAGAGGC CTATAATGAA GCTTTATACG TGTAGCTATA 4740 

TCTAXATATA ATTCT6TACA TAGOCTATAT TTACAGTGAA AATGTAAGCT GTTTATTTTA 4800 

TAT mA ATA AGCACXCTGC TAATA ACACT QCA TATT CCT TTCTATCATT TTTGXAGAGT 4860 

TT6CTGTACT AGAGATCTGO TTTTGCTATT AGACTGTAGG AAGAQTAGGA TTTCATTCTT 4920 

CTCTAGCTGG rGGTTTCAOS GTGCCAOGTT TTCTGGQTGT CCAAAGQAAG ACGTGTOGCA 4980 

ATAGTGGGCC CTCCGACAGC CCCCTCTCCC GCCTOCCCAC AGCCGCTCCA GGG6TGGCTG 5040 

GAGACGGGTG GGCGGCTGGA GACCATGCAG AG0GCCX3TGA GTTCTCAGGG CTCCTGCCTT 5100 

CTQTCCTGGT GTCACTTACT OTTTCTGTCA GGAOAGCSVGC GGGGOSAAGC CCAGGCCCCT 5160 

TTTCACTCCC TCCATC3W3A ATGGGGATCA CAQAGACAIT CCTCX33AGCC GGGGAGTTTC 5220 

rrrCCTGCCT TCTTCTTTTT GCTGTTCTTT CTAAACAAGA ATCAGTCTAT CCACAGAGAG 5280 

TOXACTGCC TCAGGTTCCT ATGGCTGGCX; ACTGCRCAGA GCTCTCCAGC TCCAAGACCT 5340 

GTTQ6TTCXA AGCCCTGGAG CCAACPGCTG CTTTTTGAGG TGGCACTTTT TCATTTGCCT 5400 

ATTCOCACAC CTCCACAGTT CAGTGGCAQG GCTCAGGATT TC6TGGGTCT GTTTTCCTTT 5460 

CrCACCGCAG T0GTC6CACA GTCTCTCTCT CTCTCTCCCC TCAAAGTCTG CAACTTTAAG 5520 

C3W3CTCTTOC TAATCAGTGT CTCACACTQG 06TAGAAGTT TTTCTACTST AAAGAGACCT 5580 

ACCTCAGGTT GCTGGTTGCT GTGTGGTTTG GT6TGTTCCC GCAAACCCCC 'm V A' GC m'i' 5640 

GGGGCTGGTA GCTCAGGTGG GCGTGGTCAC TGCTGTCATC AGTT6AAT0G TCAGCGTTGC 5700 
ATGTCGTGAC CAACTAQACft TTCTGT06CC TTAGCAT6TT TGCIGAAChC CTTGTOGAA6 5760 

CAAAAATCT6 AAAA1GT6MI TAAAATTATT TTGGATTTTS TAAAAAAAAA AAAAAAAAMl 5820 
AAAAAAAAAA AAAAAAAA 

Seq ID NO: 319 Protein sequence: 
Protein Accession #: NP_005679 

1 11 21 31 41 51 

| | | | | | 

MKDIDIGKEY IIPSPGYRSV SBRTSTSGTR RDREDSKFRR TRPIiECJQDAI* BTAARAEQLS 60 

LDASKH5QLR ILDEEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLFSCM TFSHLSSLAR 120 

VAHKRGELSM EDVW5LSKHE SSDVNCRBLB RLHQEELNEV GPCAASLRRV VHIFCRTRLI 180 

LSIVCLMITQ LAGPSGPAFM VKHLLEYTQA TESNLQYSLL LVIX3LLLTBI VRSWSLALTW 240 

ALNYRTGVRL RGAILTMAPK KILKLKNIKB KSLGBLINIC SNDGQRMPEA AAVGSLLAGG 300 

PWAIL01IY NVIILGPTGF LGSAVPILFV PAMMFASRLT AYFRRKCVAA TDEBVQKMNE 360 

VLTYIKPIKM YASmCAPSQS VQKIREEERR ILEKAGYFOG ITVGVAPIW VIASWTFSV 420 

HMTLGFDLTA AQAPTWTVF USMTFALKVT PPSVKSLSEA SVAVDRPKSL PLMEEVHMIK 480 

KKPASPBZia EMraiATIAWD SSHSSIQKSP XLTPKNQCKDK RASRGKKEKV RQLQRTBHQA 540 

VLAEQRSHUi LDSDERPSPE EEE6KHIHLG HLRIjQRTLHS IDLBIQEGKL VGICX3SVGS6 600 

KTSLISAII/S QMTLLEGSIA ISGTFAYVAQ OAWILNATLR DNILPGKEYD EERYNSVUJS 660 

CCLRPDLAIIi PSSDLTBIGE RGANLS6GQR QRZSItARALY SDRSI YIUD PLSALDABVG 720 

MHZFNSAZRK HLKSKTVIiFV THQU^YLVDC DBVZFMKBGC ZTERGTHEEL KMLNGDYATI 780 

FMNLLL6BTP PVEZKSKXET S6SQKKSQDK GPKIGSVRXB RAVKPEBGQL VQLEEKGQGS 840 

VPWSVYGVYI QAAGGPLAPL VIMALFMLNV GSTAPSTWHL SYWIKQGSGN TTVTR<2JETS 900 

VSDSMKDNPH MQYYASIYAL SMAVMLILKA IRGWFVKGT LRASSRLHDE IiFRRILRSPM 960 

KFFDTTPTGR ILNRPSKDMD EVDVRIiFFQA EMFIC2NVILV FFCVGMIAGV PPWFLVAVGP 1020 

LVZLPSVLHZ VSRVLIRELK BU)NITQSPF LSBZTSSiQG lATZHAYNKS QEFLHRYQEIi 1080 

IiDDNOAPFFZ* FTCAMRWZAV RUILZSIALI TTTGLMZVIM BGQZPPAYAO lAISYAVQLT 1140 

GLPQPTVRLA SETEARFTSV ERINHYIKTL SLBAPARIKN KAPSPDWPQE GEVTPENAEM 1200 

RYRENLPLVL KKVSPTIKPK EKIGIVGRTG SGKSSU04AL FRLVBLSGGC IKIDGVRISD 1260 

IGXjADIjRSKIi SXIPQBPVLF SGTVRSNU7P FNQYTEDQIN DALERTHMKE CIAQLPLKLE 1320 

SSVMBN6DNF SVGBRQLLCZ ARALLRBCKZ LILOBATAAM OTETDUiIQE TIREAFADCT 1380 
MLTIAKRLHT VLGSORIMVL AQGQWSFDT PSVZ^LSHDSS RFYAMFAAAB MKVAVKG 



Seq ID NO: 320 DNA sequence 

Nucleic Acid Accession #: AK022089.1 

Coding sequence: 181-1488 

1 11 21 31 41 51 

| | | | | | 

A6CA6TT6CA CAACTTCCAO CAACTTTCTC A0C08GCTAC TAATGAGCTG AAAGCCAGGA 60 

ACATOXSAGG AOAAGAGAAA GCTTCCAGGC CTCCTCCCTT CACCCTGGAA ATCCAGACAC 120 

CCXXaCCCCC ACCCTCAGAT CACTTTAAGA TAATTTCTTT ATTCGTTTGC CXX3ACAX3ACC 180 

ATGGCTCCCT TTGGAAGAAA CTTGCTAAAO ACTOGGCATA AAAACAGATC TCCAACTAAA 240 

GACATOGATT CA6AAGAGAA GGAAATTGTG GTTTGGGTTT GCCAAOAAGA 6AAGCTTGTC 300 

TGTOQGCTOA CTAAA06CAC CACCTCTGCT GATGTCATCC AGGCTTT6CT TGAGOAACAT 360 

GAGGCTA06T TTQGAGAQAA AOQATTTCTT CTG6G6AA6C CCAQTGATTA CTGCATCATA 420 

GAGAAGTGGA QAGGCTCCGA AAGGGTTCTT CCTCCACTAA CTAGAATCCT GAAGCTTTGG 480 

AAAGCX3TGGG GAGATGAGCA GCCCAATATG CAATTrGTTT TGGTTAAAGC AGATGCTTTT 540 

CTTCCAOTTC CTTT6TGG06 GACA6CTGAA 6CCAAATTAG TGCAAAACAC AGAAAAATTG 600 

TOOGAGCTCA GGOCftGCAAA CTACATGAAO ACTTTAOCAC CAQATAAACA AAAAAGAATA 660 

GTCAOGAAAA CTTTCCGGAA ACTGGCTAAA ATTAAGCAG6 ACACAGTTTC TCATGATCX3A 720 

GATAATATGG AGACATTAGT TCATCTGATC ATTTCCCAGG ACCATACTAT TCATCAGCAA 780 

GTCAAGAGAA TGAAAGAGCT GGATCTGOAA ATTGAAAAGT GTGAAGCTAA GTTCC ATCT T 840 

GATOGAOTAO AAAATGATGG AGAAAACTAT GTTCAGGATG CATATTTAAT 60CCAGTTTC 900 

A6TGAAGTT6 AGCSVAAATCT AGACTTGCA6 TATGAGGAAA A0CA6ACTCT 6GAG6ACC7G 960 

A6CX3AAAGT6 AT6GAATTGA ACAGCTGGAA GAA0GACT6A AATATTAC06 AATACTCATT 1020 

GATAAGCTCT CTGCTGAAAT AGAAAAAGAG GTAAAAAGTO TTTGCATTGA TATAAATGAA 1080 

6ATGCGGAAQ GGQAACCTOC AAGTGAACTG GAAAGCTCTA ATTTAGA6AG TGTTAAGTGT 1140 

GATTTGGAGA AAAGCATpAA AGCTGGTTTG AAAATTCACT CTCATTTGAO TGGCATCCAG 1200 

AAAQAGATTA AATACAGTGA CTCATTGCTT CAGATGAAAG CAAAAGAATA TQAACTCCIG 1260 

GCCAAGGAAT TCAATTCACT TCACATTAGC AACAAAGATG GGTGCCAGTT AAAGGAAAAC 1320 

AGAGCGAAGG AATCTGAGGT TCCCAGTAQC AATGGGGAGA TTCCTCCCTT TACTCAAAGA 1380 

GTATTTAGCA ATTACACAAA TGACACAGAC TOOGACACTG GTATCAGTTC TAACCACAGT 1440 

CAGGACTCCQ AAACAACAGT AGGAGATGTG GTGCTGTTGT CAACATAGTT CCAATGOCTC 1500 

CTTTCTGACC TGCTTTCATG TTTTAATGTT TGTTTAATTT AATAGGAAAC CICATTTTAA 1560 

ATATAACACT CAAAAAAATG TAAATCATAT TGTAGTATTC AATAQTTAAT AAAAACTOGA 1620 
GAAATGTGTT GTTTCTG 

Seq ID NO: 321 Protein sequence: 
Protein Accession #: HP_005438.1 

1 11 21 31 41 51 

| | | | | | 

MAPFGHNIiLK TRHKNRSPTK DMDSEEKErV VWVCQEEKLV CGLTKRTTSA DVIQALLEEH 60 

EATPQEKRPL-l/SKPSDYCII EKWRGSERVL PPLTRILKUW KAWGDEQPNM QFVLVK ADAF 120 

LPVPIMRTAB AKLVQNTEKL WELSPANYMR TLPPDKQKRI VRKTFRKXJVK IKQDTVSHDR 180 

DNNBTLVHLI ISQDETIHQQ VKRMKELDLE lEKCEAKFBL DRVQTDGEKY VQDAYLMPSF 240 

SEVEQNU3IO YEENQTLEDL SESDGIEQLB BRUCYVRILI DKLSAEIEKE VKSVCIDINB 300 

DAEGSAASEL ESSKLBSVKC DLEKSMKAGL RIBSELS6IQ KEIRYSDSLL OMKAKEYELL 360 

AKEFN8LHIS NKDGCQUCEN RAXBSBVPSS N6BZPPFTQR VFSRrCNDTO SDTGZSSKBS 420 
QDSBTTVODV VI*LST 

Seq ID NO: 322 DNA sequence 

Nucleic Acid Accession #: NM_030920.1 
Coding sequence: 317-1123 

1 11 21 31 41 51 

| | | | | | 

AGCATTGAAG GGGAAGGAAC T6CGGGTGTG GT6TGTGTAT GT6T6TGTGT ATGTGTGTGC 60 

GG0G0GTG06 TGOQTGTGTG TGOGOGOGCT AGTCIGTGGA CIVAGQAGGTG GGGSCAGCTG 120 

AGTTAGAGTC CCAACTCTTG GACTCCATTT GCTATTCTCT TCTTTCTCCC CCACAOCTAT 180 

CTQC5TGGTGG TAGTGGGCGT TTATATTTGC GTTCCTTTTC ATTCATTTCT AAATCTCTTA 240 

AAAATTTTGG 6TTGGGGGTA TTGGGGAAGG CAGGAAAG6G AAAAGGA6AG TAGTAGCTGA 300 

ASAGCAAlSAG 6AGGACAtV36 AGATGAAGAA GAAGATTAAC CXXXAiGTTAA GGAACAGATC 360 

OOOC3GAG6AG GTGACAGA6T TAGTCCTTQA TAAT70CCTG T6TGTCAAT6 GG6AAATTGA 420 

AGGCCT6AAT GATACTTTCA AAQAACTAGA ATTTCT6A8T ATG6CTAAT0 T6GAACTAAS 480 

TTOGCTGGCC OGGCTTCCCA GCTTAAATAA ACTTOSAAAA TTGGAGCrTA GTGATAATAT 540 

AATTTCTGGA GGCTTGGAAG TCCTGGCAGA GAAATGTCC3V AATCTTACCT ACCTCAATCT 600 

GAGT6GAAAC AAAATAAAAG ATCTCA6TAC AGTAGAAGCT CTGCAAAATC TTAAAAATTT 660 

GAAAA6TCTT 6ACCTGTTTA ACTGTGAOAT CACAAACCTG GMOATTATA GAGAAAQTAT 720 

TTTTGAACTA CTGCAOGAAA TCACATACTT AGATGGATTT GATCAC36AG6 ATAATGAA6C 780 

GCC6GACTCT GAAGAGGAGG ATGATGA6GA TGGAGAT6AA GATGATGAAG AGGAAGAGGA 840 

AAATGAAGCT 60TCCACCGG AA6GATATGA GGAAGAGG;vG GA6GAAGA6G AAQAGGAGGA 900 

TGAGGATGAG GATGAAGATG AAGATGAAGC AG6TTCAGA6 TTGGGAGA6G GASAA6A6GA 960 

AGTGGGCCTC TCATACTTAA T6AAAGAAGA AATTCAGGAT GAAGAAQAIG ATGAT6ACTA 1020 

TGTTGAAGAA QGGGAAGAAG AGGAAGAAGA GGAAGAAOGA GGTCTTOGAG G6GAGAA6AG 1080 

GAAACGAGAT GCTGAAGACG ATGGAGAGGA AGAAGATGAC TAGATC3VTTC TAAGACCAGA 1140 

TTCTCTAATG TTTCTGGGTG TGCAATAGAG TGATCACATC TTTGTTTCTT CATGTAOGAT 1200 

AGCTATOCCT ACAGAAGATA ATGTGTAACT TTTTATAGGA AAAGTGTGGT TTTACTATTT 1260 

TTGCCTTATC ATTCCAAATA AGAACTAGTC TGTTAATGAT CATATTCJTAT GTAGAGAAAA 1320 

ATTTTCATTG ACTCCCATTG TGGAATTCCC TAGCAATTTA TTTAGACTTA ATTTTTTAAA 1380 

TTCAAGCTTA CTGTATTAGT CATTTTTAGC CCATAATTAA AACATGATCA CTTTTAAACA 1440 

GGTGTAGTAT OGTGCATTTC ATTCCTTATT TATAGATTAA CTGAAATTAC ACrTTGCTAT 1500 

AATATAAAAT GACAATAGTC TCTTGAGTGG TAAGTTGGTT ATTTTTTTAG AGGTGATCCA 1560 

GGAATCTTTA GTTTGAAGGC AGTTACCTTT ' iTmTXVlT TXTrmTm ACTAAGAGTG 1620 

TTTGGTTGCT TTTTT6TCAC AAGTAACTTG GAAAATAQAA QCAGAATAGT AAAGGTTCTA 1680 

TTCAGCAAC3V TAGTTCArOS A TOTOTO GA OGTTCT A TTC AGTAAIATOG TTCATOGAT7 1740 

TAGTGGTGAC TGATAAGATT TTATTTTTGA A6GAAAAATT GCTTATACTA ABTCCA6AGA 1800 

CATGCAGGTG AOCXXTTTTTG TCAG6CTGCA AATCATGACA TGCC6ATGGT TGTTTATTTT 1860 

6TTTTTA0GT GTGCATTCTT TTTCTTCTTA GCAATTCCTT TATGATCACC TTCCCTTCTT 1920 

GTTTCACTCC CTCCCGCTCT CTCAAAAGGA ACTTG6GAAA CTTGTGAAAC CCAGQAAAAC 1980 

CTTTAGTCTT ATACCTCAAC TAOGTTTCAG TCCIGTCTQO GTTTTAAATA A0T6AAGTAG 2040 

AAGAAATT6A GT A rmtTI tj ACATAA6AAT ATATTATCAA TACA6TTTTA TGCAGTAAGC 2100 

TCTCCTTACC ATAAATGTTT CTTGGTTGAC AACATCTAAG ACAATATTA6 TGGGATGAAG 2160 

AAAGAAAAGC AGGGGTGCTT TTGGAAGCAG TGTTAGTGTT CCTCAAAAGT C3GGAACAATT 2220 

GCCTGTTQAT ATATTAATAA GACATTAAAG TCAAATTTTA ATGTTGGCCT CTCAAATGAT 2280 

TTGGATACCA CTCIQCAAAG TATTTCTAAC CTTTAATTOC CAGTTTTAAA ACAOATATAA 2340 

TAATACCATT TAATTGGAAT ATACTAOGCA 6CTGGAAAAG TATTT8AAAC TAAATTGACA 2400 

TTAAAATTAA GATTTGTTTT CAA6TGGAT0 TCCATTAAAA 6TAGAAAAAT ATTTGGGATA 2460 

AGTGAGTGTG TGTTTCCTTA CATGGCTACT AAATAAAATA TAATGAGTAT ACAAGTATAT 2520 

CTCCTCTTTT GCTATGQAGG CTCCATGTTC AAGGCAATGG CTTTTTAAAT CTTGGCTATC 2580 

TAAAATTTTT TGCCTTTGTT TTGAATATTT 6TAAGTTTTT AAOAAGTTAO TGTCAGCAAA 2640 

TTAATT6AA6 TTATGCTTCT ATACTGG6AC ATATTTAAAT ACTGAGTATA GTACTGCT6C 2700 

TACTGCTTCT ACAATOTAAA ATGTATGACT TGGTGTTTTA AAGTAAAAAT TATGATGTTA 2760 

CTTGTGGAGA AACTAAAAAT GTTGTACAAC TGACOGAAAG AAAACCCTTG GGGATAAGTT 2820 

TAGTGAQQGG ATTGGAATCC CCAAAAAGAT AACATTTTTC TTCTGCTTTT AAAAACTGAA 2880 

ATTGCCTOTT CTAGTTOCTA ACAATTCTCA TTACATACTA TOCGAaATTA CAAAATACTT 2940 

ATTTTTAAAA TGAAATCTAT ATATTGACTT TCTTATCAAT CATCTTACTG TGCAATCAAA 3000 

ATTAGAGTAC TTTGGTTTGA AAACAACACT TAQAGCCTCC AGATAACTTT TAAGACTTAT 3060 

TTAGCTTTGT GGGTGGTATT TTCATGCAAA TAAGTAAGGG TGGGTTTTAT ATTTTGTAGA 3120 

AGTTTTOOGT CCTATTTTAA TGCTCTTTGT ATGGCAGTAT GTATATATTG TGTTAAGTTC 3180 

CTCAAGAATC 7CCTTAAAAA CTTTGAAGTT AATACrTTTG TGCAACTGIG TTTTGAATAA 3240 
AGCCAT6ACA GTGTTAAAAA CAAAC 

Seq ID NO: 323 Protein sequence: 
Protein Accession #: NP_112182.1 

1 11 21 31 41 51 

| | | | | | 

MEMKKKINIiE LRNRSPEEVT ELVLDNCLCV NGEISSLZIDT FKEZiE?LSMA NVBLSSLARL 60 

PSLNXLRKLB ItSDNIISGGL EVLAEKCPNL TYUXhSCmi KDLSTVEALQ NLBNIiKSLDL 120 

FMCBXlMIiED YSSSXFBIiLQ QITYLDGFOQ BDNEAFOSEB EDDQXSDEDD EEEEEHBAGP 180 

PEGYEEEBEB EBEEDEDEDE DEDBAGSEL6 EGEEEVGLSY LMKEEXQDEB DSDOYVBEGB 240 
EEESEBEGGJj RGSKRXXtDAE DDGEEEDO 

Seq ID NO: 324 DNA sequence 
Nucleic Acid Accession #: NM_003812 
Coding sequence: 224..2723 

1 11 21 31 41 51 

| | | | | | 

TCCTCT60GT CCOGOCCCGG GAGTGGCT6C GAQGCTAOGC GA600GGGAA AGGGGQOGOC 60 

GCCXIAGCCCC GA6CCC0GCG CCXXXTTGCCC OGAGCCOSGA GC ODUa t S CC OGOSGOSGCA 120 

CCATGOGCGC OGAGCOGGCG TGACCGGCTC CGCXXXSCGGC 06CCXXGCAG CTAGCCC66C 180 

GCTCTCGCOG GCCACAOSGA G06GCGCCCG GGAGCTATGA OCCATGAAGC CGCCOGGCAG 240 

CA6CTG606G CAQCOQCCCC TGGOGGGCT6 CAGCCTTGCC GGGGCTTCCT GGG6C00CCA 300 

AGQGG6CCCC GOOOGCtOGG TGCCTGCCAG 0G0CC06GCC CGCA0Q006C CCr GOOG OCT 360 

OCTTCTOGTC CTTCTCCTGC TGCCTCOGCT OGCOGCCTOG TOCOGGCCOC GC60CTG6GQ 420 

GGCTOCTGOG CCCAGOGCTC OGCATTGGAA TGAAACTQCA GAAAAAAATT TOGGAGTCCT 480 

GGCAGATGAA GACAATACAT TGCAACAGAA TA6CAGCAGT AATATCAGTT ACAGCAATGC 540 

AAT6CAGAAA GAAATCACAC TGCCTTCAA6 ACTCATATAT TACATCAACC AAGACT06GA 600 
AAGCCCTTAT CAOGTTCTTG ACAO^AGGC AAGACACCAG CAAAAACATA ATAAGGCTGT 660 

CCATCTGGCC CAGGCAAGCT TCCAGATTGA A6CCTTCGGC TCCAAATTCA TTCTTCACCT 720 

CATACTGAAC AATGGTTTGT TCTCTTCT6A rCATGTGGAG ATTCACTAC6 AAAATG6GAA 780 

ACCACASTAC TCtAAGGGTG GAGAGCACTG TTACXAOCAT G6AAGCA3CA 6AGGGGTCAA 840 

AGACTCCAAO GTG6CTCT6T CAACCTGCAA TGGACTTCAT OGCATGTTTG AAGATGATAC 900 

CTTCGTGTAT ATGATAGAGC CACTAGAGCT GGTTCATGAT GAGAAAAGCA CAGGTOGACC 960 

ACATATAATC CAGAAAACCT TGGCAGGACA GTATTCTAAG CAAATGAAGA ATCTCACTAT 1020 

GGAAAGA6GT GACCAGT66C CXTTTTCTCTC TGAATTACAO TGOTTGAAAA GAAGGAA6A6 1080 

AGCAGT6AAT CCATCACGT6 GTATATTTGA AGAAATGAAA TATTTGGAAC TTATGATIGT 1140 

TAATGATCAC AAAACGTATA AGAA6CATO0 CTCTTCTCAT GCACATACCA ACAACTTTGC 1200 

AAAGTCCX3TG GTCAACXTTTG TGGATTCTAT TTACAAGGAG CAGCTCAACA CGAGGGrTGT 1260 

CCTGGTGGCT GTAGAGACCT GGACTGAGAA GGATCAGATT GACATCACXA CCAACCCTGT 1320 

GCAGATGCTC CATGAGTTCT CAAAATACOS GCAGCGCATT AAGCAGCATG C TGATGCT GT 1380 

GCACCTCATC TOGGGGGTGA CATTTCACTA TAAGAGAAGC AGTCTGAiSTT ACTTTGGAQQ 1440 

TGTCTGTTCT G6CACAAGA8 GAGTTGGTQT 6AAT6AGTAT 6GTCTTCCAA TG6CA6TGGC 1500 

ACAA6TATTA TOSCAGAGCX: TGGCTC3UM CCTT66AATC CAATGG6AAC CTTCTAGCAG 1560 

AAAGOCAAAA TGTGACTGCA CAGAATOCTQ G6GT6GCTGC ATCAT6GAGG AAACAGGG6T 1620 

OTCCCATTCT OSAAAATTTT CAAAGTGCAG CATTTTGGAG TATAGAGACT TTTTACAGAG 1680 

AGGAGGTGGA GCCTGCCTTT TCAACAGGCC AAC3UVAGCTA TTTGAGCCCA CX3GAATGTGG 1740 

AAATOGATAC GT6GAAGCT6 GGGAGGAGTG TGATTGTGGT TTTCAXGTGG AATGCTAIGG 1800 

ATTATGCTGT AA GAAATGTT OCCTCTGCAA CGGOQCTCftC TGCAOO SAOS GGOOC TGCTG 1860 

TAACAATACC TOITGTCTTT TTCA6CCA06 AGGGTATGAA TGC00GGAT6 CTGTGAACGA 1920 

GTGTGATATT ACTGAATATT GTACTGGA6A CTCT6GTCAG TGCCCACCAA ATCTTCATAA 1980 

GCAAGACGGA TATGCATGCA ATCAAAATCA GGGCOGCTGC TACAATGGOG AGTGCAAGAC 2040 

CAGAGACAAC CAGTGTCAGT ACATCTGGGG AACAAAGGCT GCA6GGTCTG ACAAGTTCTG 2100 

CTAT6AAAA6 CTGAATACAO AAGGCACTGA GAAGG6AAAC T600QQAAGO ATGQAGACOS 2160 

GTGGATTCAG T6CA6CAAAC ATGATGTGTT CT6TG6ATTC TTACTCTGTA OCAATCTTAC 2220 

TCGAGCTCCA CGTATTGGTC AACTTCAOGG TGAGATCATT CCAACTTCCT TCTACCATCA 2280 

AGGCCX3GGTG ATTGACTGCA GTGGTGCCCA TGTAGTTTTA GATGATX3ATA OGGATGTGGG 2340 

CTATGTAGAA GATGGAACGC CATGT66CCC GTCTATGAT6 TGTTTA6ATC GGAAGTGCCT 2400 

ACAAATTCAA GCCC7AAATA TGAGCAGCTG TCCACTOOAT TCCAA008TA AAOTCTGTTC 2460 

G6GCCATGG6 GT6TGTA6TA AT6AAGCCAC CTGCATTTGT GATTTCACCT 6GGCAQ6GAC 2520 

AGATTGCAGT ATCOGGGATC CAGTTAGGAA CCTTCACOX CCCAAGGAT6 AA6GACCCAA 2580 

GGGTCCTAQT GCCACCAATC TCATAATAGG CTCCATOSCr GGTGCCATOC TGGTAGCAGC 2640 

TATTGTCCTT GGGGGCACA6 GCTGGGGATT TAAAAATXTTC AAGAAGAGAA GGTT06ATCC 2700 

TACTCA6CAA GGCCCCATCT GAATCA6CTQ CGCTQQATGG AGACOGCCTT OCACTGTTQG 2760 

ATTCT6GGTA TGACATACTC GCAGCAGTGT TACT6QAACT ATTAAGTTTO TAAACAAAAC 2820 

CTT7GGGT6G TAATGACTAC GGAGCTAAAG TTGGGGTGAC AAGGATGGGG TAAAAGAAAA 2880 

CTGTCTCTTT TGGAAATAAT GTCAAAGAAC ACCTTTCACC ACCTGTCAGT AAACX3GGGGA 2940 

GG66GCAAAA GACCAT6CTA TAAAAAGAAC TGTTCCAGAA TCTTTTTTTT TCCCTAATGO 3000 
A0QAAG6AAC AACACACACA CftAAAATTAA ATOGAAXAAA GGAATCATTA AAAA 

Seq ID NO: 325 Protein sequence
Protein Accession #: liP_003eo3

1 11 21 31 41 51

| | | | | |

KKPPGSSSRQ PPLAGCSLAG ASCTGPQRGPA GSVPASAPAR TFPCRLIiLVL LLLPPLAASS 60 

RPRAHGAAAP SAPHWNETAB BNLGVUU9ED NTLQQNSS5N ISYSNAMQKE ITLPSHLIYY 120 

ZNQDSBSPYH VLDTKAHHQQ KHNKAVHLAQ ASFQIBAP6S KFILDLILNK GLLSSDYVBI 180 

ayaiGKPQyS KGGBHCYYHG SIRGVKDSXV ALSTOIGLRO MFSDDTFVyM lEPIiELVBDB 240 

RSTGRPBIIQ KTIAGQYSKQ KKNLTMBRGD QWPFI.SEU2H LKRRKRAVKP SROXFE04Ky 300 

LELMIVNDHK TYKKHRSSHA HTNNFAKSW NLVDSIYKBQ LNTRWLVAV BTWTEKDQID 360 

ITTOPVQMLH EFSKyRQRIK CJHADAVHLIS RVTPHYKRSS LSYFGGVCSR TRGVGVNEYG 420 

LPMAVAQVLS QSLAONLGZO WEPSSRKPKC DCTESHOGCZ MEETGVSHSR KFSKCSILEY 480 

RDPLQRGGGA CLFKRPTKLF EPTECGNGW EA(SBCDOSF HVEOrGLCCK KCSLSNGABC 540 

SDGPCX33NTS CLPQPRGYEC RDAWECDIT BYCT GD SG Q C PPNLRRQDGY AO^QNQGRCy 600 

NGECKTRDNQ OQYIWGTKAA GSDKFCYEKL NTEGTEK^C GKDGDRHIQC SKHDVFCGFL 660 

LCTNLTRAPR IGQLQGEIIP TSPTOIQGRVI DCSGAHWLD DDTDVGYVED GTPCXSPSMMC 720 

LDRKCLQXQA LNHSSCPLOS KQKVCSGHGV CS2IBATCZCD FTWAGTDCSI RDPVRHI£PP 780 
KDB8PXQPSA TNLIIQSZAO AILVAAZVLG GTGNGFKIIVK KRRFDPTQQG PI 

Seq ID NO: 326 DNA sequence

Nucleic Acid Accession #: AK074418.1

Coding sequence: 244-1515

1 11 21 31 41 51

| | | | | |

CTTTCTCCAA GACGGCOGGC CATGCTCTCC TCCTCTGCCA GTCTCCTCCA CCACTCTCTA 60 

ACCTGAQAGC CTGTGGAACC TGCCCGTCTC CCCTCCTCCA TCAGACACAC CTGCCTAGGA 120 

AACAGATOGA AAAAGTGAGG GACOGGTGAG TGACTTCCTG CTAAA6TTTA TACCAGAT6C 180 

AAATGACAGA GCTGGAGTTC TGCTGTGCCT GGAAAGGACC TCGGAAGTCT TCTAA6GAGA 240 

GTCATGG06T ATTACCAGGA GCCTTCAGTG GAGACCTCCA TCATCAAGTT CAAAGACCAO 300 

6ACTTTACCA CCTTGCGGGA TCACTGCCTG AGCATQGGCC GGAOffTTTAA GGATGAGACA 360 

TTCCCCBCAG CAGATTCTTC CATAGGCCAQ AAGCTGCTCC AGGAAAAAOG CCTCTCCAAT 420 

GTQATATGGA AQCGGCCACA QQATCTACCA GGGGGTCCTC CTOICTTCAT OCTGGATGAT 480

ATAAGCW3AT TTGACATCCA ACAAGGAGOC GCAGCTGACT GCTGGTTCCT GGCAGCACTG 540 

GGATOCTTGA CTOUSAAOCC ACAGTACAGG CAGAAGATCC TGATGGTCCA AAGCTTTTCA 600 

CACCAGTATG CTG6CATTTT COGTTTCOGG TTCTQ6CAAT GTGGCCAGTO GGTQGAAGTG 660 

GTGATTGATG A0C3SCCTACC TGTCCAQGQA 6ATAAATGCC TCTTT G '1X 30 G TCCTOGCCAC 720 

CAAAACCAAG AGTTCTGGCC CTOOCTGCTG GAGAAGGCCT ATGCCAAGCT GCTCX3GATCC 780 

TATTCCX3ATC TGCACTATGG CT T CCTCGAG GATGCCCTGG TGGACCTCAC AGGAGGOGTG 840 

ATCAOCAACA TOATCXGCA CTCTTCCCCT GT6GACCTGG TGAAGGCAGT GAAGACAGOB 900 

ACCAAGGCA6 GCTCCCTGAT AAOCTOT G CC ACTCCAAGTG G6CCAACAGA TACA6CACAG 960 

GOGATGGAGA ATGGGCTGGT GAGTCTCCAT 6CCXACACTQ T6ACTGGGGC TGAGCACATT 1020 

CAATAOOGAA GGG6CTGGGA AGAAATTATC TCCCTGTGGA KCCCCTQGC3G CTGGGGCGAG 1080 

A0C3GAATGGA GAGQ608CTG GAGTGAT6GG TCTCAGGAST GGGAGGAAAC CTCTGATCOS 1140 
OGGAAAAGCC AGCTACATAA GAAACXXMAA GATGGCGAGT TTTGGATGTC GTGTCAAGAT 1200

TTCCAACAGA AATTCATCGC CATGTTTATA TGTAGCGAAA TTCCAATTAC CCTGGACCAT 1260 

GGAAACACAC TCCACXSAAGG ATGGTCCXIAA ATAATGrTTA GGAA6CAAGT GATTCTAGGA 1320 

AACACTGCAG GAQGACCTCG GAATGATGCT CAATTCAACT TCTCTGTGCA AGAGCCAATG 1380 

QAAGGCACXA ATGTTGTCST GTGOGTCACA GTTGCTGTCA CACCATCAAA TTTQAAAGCA 1440 

QAA6ATGCAA AATTTCCACT CGATTTCCAA GTGATTCTGG CTGGCTCACA GAAACACTGT 1500 

CCAAAOCTCA AATAATAAAT TCCX3CCGCAA CTTCACCATG ACTTACCATC TGAGCCCTGG 1560 

GAACTATGTT GTGGTTGCAC A6ACACGQAG AAAATCAGOO GAGTTCTTGC TCOGAATCTT 1620 

CCTGAAAATG CCAGACAGTG ACAGGCACCT GAGCAGCCAT TTCAACCTCA GAATGAAGGG 1680 

AAGCCCTTCA GAACATGGCT CCCAACAAAG CATTTTCAAC AGATATGCTC AGCAGGTAT6 1740 

GTACCTAGCA 00CAG6GGCC TTAOGTGGGA TTGGA6AAAG GGGACCTGAG GQAGGGACAO 1800 

CCCTCACAG6 CCCTTACTGG GATGCAGAGA GGAGAAGTGA CTT6ATGGAC TATrTTACXrr 1860 

GCCTCTCTTC CTGGATaSTC TCCAGAACTG CTGTGGCTGC CAAGCTCGGT AQAGACXTrCG 1920 

06CCCCACCC AGTCTCATCC GGGGGACTTC AAGCTGGAAT GCAGAGCTTA GAAAGGGAGG 1980 

GGAIAATTAT GGGGTGTGA6 GTGCATTGCC CTCIAAATCT TTAAACAAGC AATTGGCAGT 2040 

ACCCOGFIQAA ACCmCCXT GTCCTACXOG- GCCACCTOCC ACCAACCTGG CATGSITCCT 2100 

CC06G6AGCT AGCCAGCTTC AGAAA6CACA TACAGCATCC TTCCTG O C A A ACCACXrXATG 2160 

TGCACACAGG ATTTCCTTAA TGGCTTAA7A AACTGTXATA AAGAACTOCT TQACTTSTCA 2220 

GAATAAAATA GCTGCCAGGG GCTCTGCACA AT6A6CCTCT TAOOGTTAAA AAAAAAAAAA 2280
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

Seq ID NO: 327 Protein sequence
Protein Accession #: BAB65075.1

1 11 21 31 41 51

| | | | | |

HAyyQEPSVB TSIIKFKDQD FTTIADRCLS MSRTPKDBTF PAADSSIGQK LLQSRRLSNV 60 

I\fKRPQDLPG GPPHFIU)DI SRFDIQQGGA ADCWFLAAXiG SLTQKPQYRQ KILKVQSFSH 120 

QYAGIFRFRF WQCGQWVEW ZPORLPVQGD KCLFVRPSEQ MQEFWPCIiTiE KAYAKLIiGSY 180 

8DUIYGFLED ALVDLTGGVI TKIHUISSFV DLVKAVKTAT KAGSLITCAT P56PTDTAQA 240 

HBN6LVSLEA YTVTGABQIQ YRRSHEEIIS LWPtfGWG&T SWRGRHSDGS QEWEETCDPR 300 

XSQIHKKRED GEFWMSOQD? QQKFIAIfFXC SBXPXTLDBQ NTLBEGWSQI KFRiQQfVXIiGtl 360 

TA0GP5NDAQ FNFSVQSPHE GTNVWCVTV AVTPSNLKAE DAKFPLOFQfV ILAGSQKHCP 420 
KLK 



Seq ID NO: 328 DNA sequence 

Nucleic Acid Accession #: BC017490.1

Coding sequence: 74-2788

1 11 21 31 41 51

| | | | | |

GTGGGTCAOG TGAACCACTT TTO0C3GOQAA ACCTGGTTGT TGCTGTAGTG GCGGAGAGGA 60 

TCGTGGTACT GCTATGGOGG AATCATCGGA ATCCTTCACC ATGGCATCCA GCCCGGCCCA 120 

6CGTCG6CGA GGCAAT6ATC CTCTCACCTC GAGGCXHtSGC CGAAGCTCCC GGOGTACTOA 180 

TQCOCTCAOC TOCAGOOCTQ GOOCSTOACCT TCCACCATTT GAfiGATQAOT COQAGGGGCT 240 

CCTAGGCAak GAGG6GCCCC TG6AGQAAC3A AGAOaAIXSOA GAG6AGCTCA TTG6AGATGG 300 

CATGGAAA6G GACTACOGCXS CCATCCCAGA GCTGGACGCC TATGAGGCX3G AGGGACTGGC 360 

TCTGGATGAT GAGGAOGTAG AGGAGCTGAC GGCCAGTCAG AGGGAGGCA6 CAGAGOGGGC 420 

CATGOGGCAG C6TGAC0GGG A6GCTGGCCG 66G0CTGGGC CGCATQCGCC GTGGGCTCCT 480 

GTATGACA6C GATGAGGAGG A0GAGGAGC6 COCIGCCCGC AA608C0GCC AOGTGGAGCG 540 

GGCCAOSGAG GACGGOGAGG AGGAG6AGGA 6AT6AT06AG AGCAT0SA6A ACCT6GA6GA 600 

TCTCAAAGGC CACTCTGTGC GCGAGTGGGT GA6CATGGCG GGCCCCC3GGC TGGAGATCCA 660 

CCACOGCTTC AAGAACTTCC TGOGCACTCA OGTOSACAGC CACGGCCACA ACGTCTTCAA 720 

G6AGC6CATC A6CGACATGT GCAAAGAGAA CCGIGAGA6C CT6GTG6TGA ACTATGAG6A 780 

CTTGGCA6CC AGGOAGCAOG TGCTGGCCTA CTTQCTGGCT GAGQGAGGSG GGSAGCTGCT 840 

GCAGATCTTT GATGA66CTG CCCTG6AGGT GGTACTG6CC AT6TACCCCA AGTACX3ACC38 900 

CATCACCAAC CACATCCATG TOCGCATCTC CCACCTGCCT CTGGTGGAGG AGCTGCGCTC 960 

GCTGAGGCAG CTGCATCTGA ACXaOCTQAT COOCACCAGT GGGGTGGTGA CCAGCTGCAC 1020 

TGGOGTCCTG CCCCAGCTCA GCATOGTCAA GTAOUVCTGC AACAAGTGCA ATTTCGTCCT 1080 

GGGTCCTTTC TGCCAGTCCC AGAACCAGGA GGTGAAACCA GGCTCCPGTC CTGAGTGCCA 1140 

GTCX5GCCX30C CCCTTTGAGG TCAACATGGA GGAGAOCATC TATCAGAACT ACCAGCGTAT 1200 

CCGAATCCAG GAGAGTCCAG GCAAAGTGGC GGCTGGCCGG CTGCCCOGCT CCAAGGAOGC 1260 

CATTCTCCTC GCAGATCTGG TGGACAGCTG CAAGCCAGGA GA0GA6ATAG AGCTGACTGG 1320 

CATCTATCAC AACAACTATG ATGGCTCCCT CAACACTGCC AATGGCTTCC CTGTCTTTGC 1380 

CACTGTCATC CTAGCCAACC AOGTGGCCAA GAAGGACAAC AAGGTTGCTG TAGGGGAACT 1440 

GACCGATGAA GATGTGAAGA TGATCACTAG CCTCTtXAAG GATCAGCAGA TCGGAGAGAA 1500 

GATCTTTGCC AGCATTGCTC CTTCCATCTA TGGTCA7X3AA GACATCAAGA GAGOCCTGGC 1560 

TCTGGCCCTG TTCGGAGGGG AGCXCAAAAA CCCAGGTGGC AAGCACAAGQ TACGTGGTGA 1620 

TATCAACGTG CTCTTGTGCG GAGACCCTGO CACaGOSAAG TCGCAGTTTC TCAAGTATAT 1680 

TGAGAAAGTG TCCAGCXX3AG CCATCTTCAC CACTGGCCAG GGGGCGTOOG CTGTGGGCCT 1740 

CACGGOGTAT GTCCAGOGQC ACCCTGTCAG CAGGGAGTGG ACCTTGGAG6 CTGGGGCCCT 1800 

GGTTCTGGCT GACCGAGGAG TGTGTCTCAT TGATGAATTT GACAAGATGA ATGACCAG6A 1860 

CAGAACCAGC ATCCATGAGG CCATGGAGCA ACAGAGCATC TCCATCTOGA AGQCXX30CAT 1920 

CGTCACCTCC CTGCAGGCTC GCTQCAOGGT CATTGCTQCC GCCAACCCCA TAGGAGGGOG 1980 

CTACX5ACCCC TCX3CTGACTT TCTCTGAGAA CGTGGACCTC ACAGAGCOCA TCATCTCAOG 2040 

CTTTGACATC CTGTGTGTGG TGAGGGACAC CGTGGACCCA GTCCaGGAOS AGATGCTGGC 2100 

CC GCri'CGTG GTGGGCAGCC ACGTCAGACA CCACCCCAGC AACAAGGAGG AGGAGGGGCT 2160 

GGCCAATGGC AGCGCTGCTG AGCCOGCCAT QCCCAACAOS TATGGCGTGG AGCCCCTGCC 2220 

CCAG6AGGTC CTGAAGAAGT ACATCATCTA CGCCAAGGAG AGGGTCCACX: OGAAGCTCAA 2280 

CCAGATGGAC CAGGACAAGG TGGCCAAGAT GTACAGTGAC CTGAGGAAAG AATCTATGGC 2340 

GACAGGCA6C ATCCCCATTA CGGTGCGGCA CAT0GA6TCC ATGATCC6CA TGGCGGAGGC 2400 

CCA0GG60GC ATCCATCTGC 66GACXATGT GATCXSAAGAC GAOGTCAACA TGGCCATCOQ 2460 

06T6ATGCTG GAGAGCTTCA TA6ACACACA 6AAGTTCAGC GTCATGOGCA QCATGOGCAA 2520 

GACmTGCC CGCTACCTTT CATTCCX3G0G TQACAACAAT GAOCTGTTGC TCTTCATACT 2580 

GAA6CAGTTA 6TGGCAGAGC AGGTGACATA TCAGCGCAAC OGCTTTGGGG CCCAGCAGGA 2640 

CACTATTGAG GTGOCTGAGA AGGACTTGGT GGATAAGGCT GGTCAGATCA ACATOCACAA 2700 
CCTCTCTGCa TTTTATGACA GTGAGCTCTT CAGGATGAAC
AAGGAAAATG ATCCTGCAGC AGTTCTGAGG CCCTATGCCA 
TTCTGGTTTG GGGT6GTCA6 TSOCCTCT G T GCTTTAT6GA 
TGAACTOGGG 6TACTAG6GT CAGQ6CTTAT AQCAGGftXGT 
TGTTTGTTTC TOCAAG0CT6 CmXS ' mC TT CTCAOCTTTG 
TGTCTTACTT GGTTGCTGAA CATCTTGCCA CCTCOGAGTG 
TTGGATCAGA GCTGCTGAGT TCAGGATGCC TGOGTGTGGT 
ATGGATGTCA GGAGAGCTGC TGCCCTCTTG GCGTGAGTOXS 
TGCCTTTGGC CAGAGAGCTQ GTTGAAGATG TTTGTAATCG 
TCTGTGCCXX: TGTGGTGGAA GAGGGCAOGA CAGTGCCAGC 
GTCGCAGGGG TGGGATGT6A GTCATGCX3GA TTATCCACTC 
TTGCTOOCTG TCTGTTTCCC CACTCTCTTA TTTGTGCATT 
TAATTTTTAA TAAAGTTGAA TAAAATATAA AAAAAAAAAA 

Seq ID NO: 329 Protein sequence
Protein Accession #: AAH17490.1



MABSSBSF-m 



DREAf^GUSR 
SVREWVSMAG 
EHVIAYFLPE 
HUlQIilRTSG 
FEVNKEETIY 
HYDGSLNTAN 
lAPSlYGHED 
SRAIFTTGQG 
HBAi4EQQSXS 
CWRDTVDPV 
KKYZIYAKER 
HLROYVZEDD 
AEQVTYQRim 
LQQF 



11 

I 

ASSPAQRRRG 
ELI(UX94RRD 
MRRGIiLYDSD 
PRIiBZHHRFK 
APAELI^IFD 
WTSCT6VLP 
QHYQRIRIQE 
GPPVFATVIL 
IKRGIiALALF 
ASAVGLTAYV 
ISKAGXVTSL 
QDSOARFW 
VHPKUIQMDQ 
VNMAIRVMIiB 
FGAQODTZEV 



21 



NDFIiTSSPGR 
YRAIPBLOAY 



KFLRTHVD5E 
EAALEWLAM 
QLSMVKYKOf 
SPGKVAAGRL 
ANHVAKKDNK 
GGEPIQ7PGGK 
QRHPVSREIH' 
QARCTVIAAA 



DJCVATOfYSDIt 
SFZDTQKPSV 
PEKDLVDKAR 
1

SSKR7DALTS 
BAB8LAZJ3DB 
KRQVERATED 
6RNVFKERIS 
YPKYDRITOH 
KCNPVLGPFC 
PRSKSAILIA 
VAVGELTDED 
EXVRGDIKVL 
LEAGALVLAD 
NPIGGRYDPS 
KEEEOLAHGS 
RKBSMAT6SI 
MRSMBKTFAR 
QZNZHNLSAF 



AAGTTCAGCC 
TCCATAAGGA 
CACAAAACCA 
CTGGCT GC RC 
GGTG66A3GC 
CTTTGTCTCC 
TTAGGTGTTA 
CGTATTCAGG 
TTTTCAGTCT 
GCAGCX3TTCT 
GCCACAGTTA 
CGGTTTGGTT 
AAAAAA 



41 

I 

SPGRDLPPFE 
OVBBLTASQR 



DNCKENRESL 
IHVRISHLFL 
QSQNQEVKPG 
DIiVDSCKPGD 
VKMITSLSKD 
LCGDPGTAKS 
RGVCLIDSFD 
LTFSEMVDLT 
AAEPANFNTY 
PZTVRRIESM 
YLSFRRDNNS 



AOGACCTGAA 
TTCCTTGGGA 
6A6CACTTGA 
C7GGCATGAC 
CTTG0C3VST6 
ACTCAGTACC 
GCCTTCTTAC 
CTGCTTTTGC 
CCTGCAGGTT 
GGGCTOCTCA 
TCAGCTGCCA 
TCTGTAGTTT 



51 
I 

DSSEGTjLGTE 
EAAERAHRQR 
lENLEDLKSH 
WNYEDIAAR 
VCELRSLRQL 
SCPECQSAGP 
BXEI.TGIYHH 
QQZ6EKIPAS 
QPLKYIEKVS 
KMJJDQDRTSI 
EPZISRFDIL 
GVEPLPQBVIt 
ZRMABARARZ 
LUiFZLKQLV 
FSBDLKRXMZ 



Seq ID NO: 330 DNA sequence
Nucleic Acid Accession #: M17254
Coding sequence: 257-1645



GTCCGOGOGT 
CX^GGTCGCAC 
CTTTGGAGAC 
AMGATOGCA 
TGGCTTACTG 
CTTATCAGTT 
GGCTAAGACA 
CCCA0606TC 
06AATGTAAC 
CAAAG60GG6 
GGAGGAGAAG 
AGCAGATCCT 
AGAATATGGC 
GTGCAAGATG 
TCTCTCACAT 
TGATAAAGCC 
TGAGCCCCCC 
TQCTCAAOCA 
TTATCAGATT 
GCTTTOGCAG 
GGAAG6CACC 
AGAGCGGAAG 
CTATGACAAQ 
CCACGOGATC 
CTCAGACCTC 

GccxxACCxrr 

CTGGAATTCA 
TTCTCATCTG 
CACCAOCCCA 
TGAAAAAAGC 
GA6GGAGTTA 
66ACATATCA 
AAGGACAAAG 
AATCCCACTA 
AACATACOGT 
TCAAAAACAA 
ACTGCATGGC 
CAGCTTTCTC 
ACTATGAACT 
AGGGGTGAAG 
TCTCAA6CAA 
ACT0GAGG6T 
GTAATGGAiGA 
TGTCAAATGA 
TCATTATGTG 



11 
I 

GTCaSCGCCC 
TAACTCCCTC 
CCGAGGAAAG 
GAACCAAGGG 
AAG6ACAT0A 
6TGAGTGAGG 
GAGATGACCG 
CCTCAiGCAGG 
CCTAGCCAGS 
AAGATGGTGG 
CACATGCCAC 
AOGCTATGGA 
CTTCCAGACG 
ACCAAOGACG 
CTCCACTACC 
TTACAAAACT 
AGGAGATCAG 
TCTCCTTCCA 
CTTGGACCAA 
TTCCTCCTGG 
AACGGGGAGT 
AGCAAACCCA 
AACATCATGA 
GCCCAGGCCC 
CCGTACATGO 
CCAGCCCTCC 
CCAACTGGGG 
GGCACTTACT 
TCOCCAGAAA 
TTTACTGOGG 
CTGAAGTCTT 
TCTGTGGACT 
TGCCAAAGAA 
ATGCAAACTG 
TTAXAAT6CC 
GAGAAAAGAC 
ATGTGCTGTT 
AAACTGTGAA 
AAAAGGTGGG 
AAGGAGOAGG 
TOAAGACTGG 
TCATGCAGTC 
AAGGGAAGTA 
AAATTTTAAC 
GGGQCTTTQT 



21 
I 

GC6TGTGCCA 
GGCGCCGAOG 
CCGTGTTGAC 
CAACTAAA6C 
TTCAGACTGT 
ACXZAGTCGTT 
CGTCCTCCTC 
ATTGGCTGTC 
TGAATGOCTC 
GCAGCCCAGA 
CCCCAAACAT 
GTACAGACCA 
TCAACATCTT 
ACTTCCAGAS 
TCAGAGAGAC 
CTCCAOQGTT 
CCTGGAC060 
CAGTGCCCAA 
CAAGTAGCC6 
AGCTCCTGTC 
TCAAGATGAC 
ACATGAACTA 
CCAAGGTCCA 
TCCAGCCCCA 
GCTCCTATCA 
CGGTGACATC 
GTATATACCC 
ACTAAAGACC 
CTCTATCGGA 
CTGGGGAAG6 
ACT ACAGAA A 
GACCTTGTAA 
AGTGGTCTTA 
GGATGAAACT 
ATTTTAAGGA 
AOGAGAGAGA 
TTGGTTGAAA 
GATGACCCAA 
ACTGAGGATQ 
AAGAGGCAGA 
ACTCAGQACA 
AGTGTTATAC 
GTAGAATTOV 
7GGAATTGTC 
TCTCCACAGG 



31 

I 

GOGCXSCGTGC 
GCGGCGCTAA 
CAAAAGCAAG 
CGTCAGOTTC 
CCCX3GACCCA 
GTTTGAGTGT 
CAGCGACTAT 
TCfACCCCCh 
AAOGAACTCT 
CACGGTTOGG 
GACCACX3AAC 
TGTGCX3GCAG 
6TTATTCCAG 
6CTCACCCCC 
TCCTCTTCCA 
AATGCATGCT 
TCACGGCCAC 
AACTGAAGAC 
CCTTGCAAAT 
GGACAGCTCC 
GGATCCCGAC 
CX3ATAAGCTC 
TGGGAAGOGC 
CCCCCCX3GAG 
08CCCACCCA 
TTCCAGTTTT 
CAACACTAQO 
TGGCGGAGGC 
GAACATGAAT 
AAGCCGGGGA 
TGAGGAGGAT 
AAGACAGTGT 
AGAAATGTAT 
AAAGCAATAG 
AAACTACCTG 
CT6T6GCCCA 
TCAAATAGAT 
AGTTTCCAAC 
TGTATAGAGT 
GAAGGAG6AG 
TTT6GG6ACI 
CAAAGCOUST 
GAAACAAAAA 
TGATATTTAA 
GTCAGGTAAG 




ACAAATGACT 
TOAACaOCTO 
GCAOCTCATA 
GCCTAOGGAA 
GGACAGACTT 
GCCAGG6TCA 
CCTGATQAAT 
ATGAACTAOG 
GAGCX3CAGAG 
TGGCTGGAGT 
AACATOGATG 
AGCTACAACO 
CATTTGACTT 
AGAAACACAG 
CCCAOSCCCC 
CAGCGTCCTC 
CXAGGOUSTG 
AACrOGAGCT 
GAOGTQQOCC 
A3CC3GCGOCX: 
TAOGCCTACA 
TCATCTCTGT 
CAGAAGATGA 
TTTGCTOCCC 
CTCCCCACCA 
TTTfCCCATC 
CAAAAGTGCC 
A6AGATCCAA 
6CTAAAAATG 
ATG TAGAA GC 
AAACTTTAGA 
AAACAACACA 
TATTTAAAAA 
TCAACA6ACG 
TCOGTTTQAT 
TCCTTTACAG 
GAG03TGTGA 
AODIGGCTGG 
GTGTACAAIQ 
GTTAGGAGAA 
TGOGCATCTC 
GAGAAACATT 
AGATGGCCTT 



CGOGCOGAGC 
ATTCCAGGAT 
CACAGAGAAA 
GTAGATGGQC 
TCAAGGAAOC 
CGCCACACCT 
CCAAGATGAG 

cx:atcaaaat 

GCA6T0TGGC 
GCAGCTACAT 
TTATCGTGCC 
GGG06GT6AA 
GGAAG6AACT 
COSACATOCT 
CAGATGATGT 
ATTTACCATA 
AGTCGAAAGC 
AGTTAGATCC 
6CCAGAT0CA 
GCATCACCTG 
GGCGCTGGGG 
TCCGTTACTA 
AGTTOGACTT 
ACAAGTAC3CC 
ACTTTGTGGC 
CAAACCCATA 
GCCATATGCC 
AGCGTGCATT 
TCAAGAGGAA 
AGACTCTTGG 
TCACGAATAT 
AT6AA6TCTT 
GTAGAGTTTG 
GTTTTGACCT 
TAGTTTCATA 
TT6ATATGCA 
GGACAGCTGT 
TATTACC6GG 
TTGTAGACAG 
GAAAGAAACT 
AGTTAX6QAG 
ACGACACAGC 
TITC r n ' G TT 
CAGGACCTCA 
CTTG6CTGCC 



2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180
3240 
3300 
3360 
3420 
60
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 



60 
120 
180
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
ACAATCA6AA ATCAOGCAGO CKimOBGT AG00G60CTC aVGTTTTCCT TT6AGT0606 2760

AAC6CTGTGC GTTTGTCAGA ATGAAGTATA CAAiGTCAATO r rmmXX: TTTTTATATA 2820 

ATAATTATAT AACITATGCA TTTATACACT AOGAGTTGAT CTOGGCX^GC CAAAGACACA 2880 

06ACAAAAGA GACAATCGAT ATAATGTGGC CTTGAATTTT AACTCTGTAT GCTTAATGTT 2940 

TACAATATQA AGTTATTAGT TCTTAGAATG CAGAAT6TAT GTAATAAAAT AAGCITGGCC 3000 

TAGCATGGCA AATCAGATTT ATACAGGAGT CTGCATTTGC ACTTTTTTTA GTGACTAAAG 3060 

TT6CTTAAT6 AAAACAT6TO CKSAAIGm TGGATTTTGT GTTATAATTT ACTTTGTCCA 3120 
GGAACTTGT6 CAAGG6AGAG CCAAGGAAAT AGGATCmG GCACOC 

Seq ID NO: 331 Protein sequence
Protein Accession #: AAA52398

1 11 21 31 41 51

| | | | | |

MZQTVPDPAA HIKEALSWS EDQSLFECAY GTPHLAICra4 TASSSSDYGQ TSKMSPRVPQ 60 

QDNZiSQPPAR VTZKMECNPS QfVNGSRNSPD ECSVAKQ6RN VGSPDTVGMN YGSYMEEKEM 120 

PPPHMTTHBR RVZVPADPTL USTDHVRQHL BHAVKEYGLP DVHILLFONZ DGKEZXXMTR 180 

DDFQRLTPSY KADILLSHUJ YLRBTPLPHL TSDDVDKALQ NSPRLMHARN TDLPYEPPRR 240 

SAWTGHGHPT PQSKAAQPSP STVPKTEDQR PQLDPYQZU3 PTSSRLANPG SGQIQIjWQPL 300 

LELLSDSSKS SCZTWBQTKG EFKMTDPDEV ARRH6ERKSK PMMSIYDKLSR ALRYYYDKNZ 360 

MTKVHGKRYA YKFDFHGZAQ ALQPHPPBSS LYKYPSOLPy MGSYHAHPQK HHFVAPHPPA 420 

LPVTSSSFFA APin>Yf4NSPT GGZYPNTRLP TSHMPSSL6T YY 462 

Seq ID NO: 332 DNA sequence
Nucleic Acid Accession #: NM_000020
Coding sequence: 263-1794

1 11 21 31 41 51

| | | | | |

AGGAAAGG6T TTATTAGGAG GCAGTGGTGG A6CT6GQCCA GGCAGGAAGA 06CTGGAATA 60 

AGAAACATTT TTGCTCCAGC CCCCATCCCA GTCCCGGGAG GCTGCCGCGC CAGCTGCGCC 120 

GAGOGAGCCC CTCCCX3G0CT CCAGCCCGGT CCGGGGCOGC GCCGGACCCC AGCCCGCCGT 180 

CCAQOSCTQG OGGTOCAACT GCGGCCGCGC GGTGGAGGGG AGGTGGCCCC GGTCXX3CCGA 240 

AOGCTAGOGC CCCGCCACCC GCAGAGCGGG CCCAGAGGGA CCATGACCTT GGGCTCCCCC 300 

AGGAAAGGCC TTCTGATGCT GCTGATGGCC TTGOTGACCC AGGGAGACCC TGTGAAGCOG 360 

TCTCGGGGCC CGCTGGTGAC CTGCACOTGT GAGAGCOCAC ATTGCAAGGG GCCTACCTGC 420 

C6GGGGGCCT GGTGCACAGT AGTGCTGGT6 CGGGAG6AGG 6GAG6CACCC CXIAGGAACAT 480 

GGGGGCTGGO OOAACTTGCA CAGGGAGCTC TGCAGGGG6C GCCCCACCGA 6TT0GTCAAC 540 

CACTACTGCT 00GACA6CCA CCTCTGCAAC CACAAC3GTGT CCCTGGTGCT GGAGGCCACC 600 

CAACCTCCTT CGGAGCAGCC GGGAACAGAT GGCCAGCTGG CXCTGATCCT GGGCCCOGTG 660 

CrrGGCCTTGC T6GCCCTGGT GQCaXTGGGT GTCCTGGGCC TGTGGCATGT CC!GACGGAGG 720 

CAGGAGAAGC AG06TGGCCT OCACAGCGAG CTGGGAGAGT CCA6TCTCAT CXTGAAAGCA 780 

TCTGAGCAGO GOQACACSAT OTTQQGGGAC CTCCTGGACA GTGACIGCAC CACAGGGAGT 840 

G8CTCA0GQC TCCOtY fCCT GGTGCAGAG6 ACAOTGOCAC G6CA6GTTQC CTTGGT6GAG 900 

TGTGTOGGAA AAGGCCGCTA TGG0GAAGT6 T6GO0GG6CT TGTGGCA06G TGAGAGTGTG 960 

GCCGTCAAGA TCTTCTCCTC GAGGGATGAA CAGTCCTGGT TCCGGQAGAC TGA6ATCTAT 1020 

AACACAGTAT TGCTC3VGACA CGACAACATC CTAGGCTTCA TCGCCTCAGA CATGACCTCC 1080 

OGCAACTGGA GCAC8CAGCT GTGGCTCATC ACGCACTACC A06A6CA0Q8 CTCCCTCTAC 1140 

6ACTTTCTGC AGAOACAGAC GCTQ6ASCCC CATCTG6CTC TGAG6CXAGC TOTOTCCGCO 1200 

GCATOCGGCC TGGCGCACCT 6CAGGTGGAG ATCTTCGGTA CACAGGGCAA ACCAGCCATT 1260 

GCCCACCGOS ACTTCAAGAG CCGCAATGTG CTGGTCAAGA GCAACCTGCA QTOTTGCATC 1320 

GCC6ACCTGG GCCTGGCTGT 6ATGCAC7CA CAGGGCAGCG A7TACCTGGA CATOGGCAAC 1380 

AA00CGAGA8 TGGGCACCAA GG8GTACAI0 6CA0C0GAG0 TGCTGGAOGA GCAGATOOQC 1440 

ACGGACT6CT TTQAOTCCTA CAAGTGGACT GACATCTGGG CC m ' GQ C Cr G G T G CI G T 0 3 1500 

GAGATTGCCC GCOGGACCAT 06TGAATGGC ATOSTGGAGG ACTATAGACC ACCCTTCTAT 1560 

GATGTGGTGC CCAATGACCC CAGCTTTGAG GACATGAAGA AGGTGGTGTO TOTGGATCAG 1620 

CAGACCCCCA CCATCCCTAA CG6GCTGGCT GCAGACCOGG TCCTCTCAGG CCTAGCTCAG 1680 

ATGATGGQGO AGTGCTBOTA CCCAAACCCC TCTOCXXCSAC TCAOOGCGCT G0GGATCAA6 1740 

AAGACACTAC AAAAAATTAO CAACAGTCCA GA6AAGCCTA AAGTGATTCA ATAGCCCAG6 1800 

AGCACCTGAT TCCTTTCTGC CTGCAGGGGG CTGGGGGGGT GGGGGGCAGT GGATQGTGCC 1860 

CTATCTGGGT AGAGGTAGTG TGAGTGTGGT GTGTGCTGGG GATGGGCAOC TGCXJCCTOCC 1920 

TGCTOGGCCC CCAGCCCACC CAGCCAAAAA TACAGCTGGG CTGAAACCTG ATCCCCTGCT 1980 

GT C T GG OCTG CTCAAAOOGQ CAGGCTCCCT GA08CCTG6C TCTCTCOOCA CCCCTATG6C 2040 

CA6CATGGTG CACOCCCTAC CACTCCGGG8 ACAOGATGCA AAAGAQGCTC CAGAGTCAGA 2100 

GTGCCAAGCC AGGGAATCCC AGTCCCAGAC TCAGAGCCCG GGCCTGCACT TTOCCCCCTG 2160 

CCCTTGATCA ACCCCACTGC CCCACCAGAG CTGCCAGGGT GGCACAGGOC CCTGTCCAGC 2220 

CCCTGGCACA CACTTCCCTG CCAGGCCTCA GCCTCtAGCA TAA6CTCCA0 AGAGCCAGG6 2280 

OCCATCAGTT T C TCTCTGrG GATTTGTATC TCAOCTGCAT GATOOCTIGQ G C TTTCTGTC 2340 

TCCTCAACAA GA6TQCAGCT TOCTGAATGT CA6CTGCCT6 ABA6A6CTGQ 06CCT6ACTT 2400 

ACTAGGGCAT TAAATCCTAA GAGGTCCTAC TGAGGTGTGG C3«36ATCACA GGCCAGTGGA 2460 

AAAAGGGCAG GTCAGATGGO CAAGGCCCAG GACTT7CAGA TTAACTGAGA GGATATCGAG 2520

6CCAAGCATG 6CAGGGGGAA GGTCAGTG O G TGTCAAGAGA CCCA06TCTG ACCCX36GATG 2580 

TTIX^CTCCAT GT6ACAAAAO CAGGCCTGTC TCAGOACCTT TTCTTTTCTT TTTTCCTTCT 2640 

TTTTTTTTTT GACA0GGA6T TTOQCTCTTO TTGTCCA6GC TAGAGTGCAA TOGCATGATC 2700 

CXftGCTCACC GCAACGTCTA CCTCCCAGGT TC3UATCATT CTCTTGCCTC AGACTCCXX3A 2760 

GTAGCT6GGA TTACAGGCAC ATGCCACCAT GCCTGGCTAA TTTTGTATAT TTAGTAGAAA 2820 

CAGGGTTTCA OCATGCTGGC CATGCTGGTT CTGGAACTCC TGACCTCAGG TGTTCCACCT 2880

A0CTCA6CCT CCXAAAGT6C TGG66TTACA QGTOTGAGCC ATOGOGCCIQ GCCAG6A0CT 2940 

TTGTTTCTTA TCTACATATT GGAAGATTTG 6TCCT6ATGT OCTTTGAGGC TTCTTTAGCT 3000 

CTAGTTCrCT GACACTTCAG CCTATATCAC AGCTAACTTC YTCAGTCTCA TCTATTCCTT 3060 

ATGCTCCAGC CCCTGGCAAT TTGCCTCAAG ATGGGGGTTT GAAAATAACT TTACCT6ACT 3120 

CAAGGA6TGT CT6QAGCACC TCCTAGTCTA AGTCTGCAAG CTCCAGTTCr T6CCIAAAAC 3180 

CATOCCASTG GCCACCCT7G G6CTCAGACA 6CICTGG0CC TTTTGAOCAC AAOOCAOCCC 3240 

CTOGCCCTCT CTGTG6CATA G' i V rr Ci'C TO CCCCAGGACT GCAQQ6006C TTCCTCCAAO 3300 

GCTTCCAAGG CTCAAAAGAA ATTTGGCTCC ATCCAAGAAO OCTCCAGCTC CCCTACTGGC 3360 

CCCTGGCTTC AGGCCCACAC CCCTGGGCCA GGSCCAGAGA GTGTGT C TCA GGAOVATTCA 3420 

ATGGGCTCIA GA6AGACACA CAGAAAGTTT GGGCATTTG6 GAAATTTTCA AGGRT6TATG 3480 
TATOGYTCIkC GTATGGWGCA GGTTG'AWfQ GTCCYKSGGT 6CA0G6AAGT GGGCTOCAGG 3540

6AAGTG6ATT GQAGGG6ASC TTXSA6GAATA TAAGGAGCGG GGGXGGAGAC TCAGGCTATG 3600 

GACAAGGACA GCCOCAAGGT TGCSSAACACC TGGCCTTAGT CGTCCTCAGC CTAG6GCAGG 3660

GCaCTGAAGA AAGCTCTCCC CGCTCCTGCT GTAATGACCC AGAGTAGCCT CCCgg 3CCX5 3720 

GCATCTTATG TGTGTCTTCC ACCATtCTCA TOGTGGCACT TTTCTAGGCC TGTCTCCCAG 3780 

CATTGIGCAA GGCTOSGAAG AGAACCAG6A AGTGAAACTO GGTGAAAACA GAAAGCTCAA 3840 

TGGATGGGCT AGGTTCCCAG ATCATTAGOG CAGAGTTTSC AOC?rCCTCIQ GTTCACTGGQ 3900 

AATCCACXX31 GCCCAOGAAT CATCTCXXTC TTTGAAGGAT TTTWATTTCT ACTGGGTTTT 3960 

GGAACAAACT CCTGCTGAGA CCCCACA6CC AGAAACTGAA AGCAGCAGCT CCCCAAAGCC 4020 

TGQAAAATCC CTAAGAGAAG GCCTGGGGGA MAGGAAKTQG AGTGACAGGG GACAGGTAGA 4080 

GAGAAGG6GG CCCAATG6CC A0G6AGTGAA OaAGGTGGOS TTGCT6A6AG CAGTCTGC31C 4140 

ATGCTTCTGT CTGAGTGCAG GAAG6TGTTC CA06GIOGAA ATTACACTTC TCffShCCCGG 4200 

AGAOGCTCTT TGTGGGAGCA CTGGGCTCAT 6CCTGQCACA CAATA6GTCT GCRATAAACC 4260 
ATOGTTAAAT 0CT6AAAAAA AAAAAAAAA 

Seq ID NO: 333 Protein sequence 
Protein Accession #: NP_000011

1 11 21 31 41 51

| | | | | |

MTLGSPHK6L LMLLMALVTQ CTPVKPSRGP LVTCTCESPH CKGPTCRGAW CTWLVREBG 60 

RHPQESRGCG MX£RBEjCBGR PTEFVUHYCC DSHLQIHHVS LVLEATQPPS BQPGTDGQLA 120 

LILGFVZALL ALVAU3VLGL KHVHRRQEKQ RGLHSEL6ES SLILKASBQG DTMLGOLLDS 180

DCTTGSGSQL PPLVQRTVAR QVALVECVGK GRYGEVWRGL WHGESVAVKI FSSRDEQSWP 240 

RETEIVNTVL LREDNILGPI ASDMTSRNSS TQIiWLITHYH EHGSLYDFLQ RQTLEPHLAL 300 

RLAVSAACGL AHLHVEIPGT QGKPAIAHRD FKSRNVLVKS NLQCCIADLG LAVMHSQGSD 360 

YLDIGNHPRV GTKRYMAFEV LDBQIRTDCP BSYXWTDZHA FG'LVUHEIAR RT IVMG IVHD 420 

YRPPFVOWP NDFSFEDMKK WCVZ)QQTPT IPNRLAADFV liSGLAQMMRE CHYPNPSARL 480 
TAIiRIKKTliQ KISNSPEKPK VIQ 

Seq ID NO: 334 DNA sequence

Nucleic Acid Accession #: NM_004126.1

Coding sequence: 108-329

1 11 21 31 41 51

| | | | | |

GGCACGAGCT O GTG C CGGCC TrCAGTTGTT TCOGGACGCG C0GA0CTTC6 COGCTCTTCC 60 

AGCGGCTCCG CTGCCAGAGC TAGCCOCSAGC COGGTTCTGG GGC3GAAAATG OCTGCCCTTC 120 

ACATCGAAGA TTTGCCAGAO AAGGAAAAAC TGAAAATGGA AGTTGAGCAO CTT06CAAAG 180 

AAGTGAAGTT GCAGAGACAA CAAGTGTCTA AATGTTCTGA AGAAATAAAG AACTA TATT G 240 

AAGAAC6TTC TGGAGAGGAT CCTCTAGTAA AGGGAATTCC AOAAGACAAG AACXCCTTTA 300 

AAGAAAAAGO CAGCTGTGTT ATTTCATAAA TAACTTGG6A G AAACTG CAT CCTAAGTGGA 360 

AGAACTAGTT TGTTTTAGTT TTCCCAGATA AAACCAACAT GCTTTTTAAG GAAGGAAGAA 420 

TGAAATTAAA AGGAGACTTT CTTAAGCACC ATATAGATAG GGTTATGTAT AAAAOCATAT 480 

GTGCTACTCA TCTTTGCTCA CTATGCAGTC TTTTTTAAGA GAGCAGAGAG TATCAGATGT 540 

ACAATTATGG AAATAAGAAC ATTACTTGAG CATGACACTT CrTTCAGTAT ATTGCTTGAT 600 
GCTTCAAATA AAGTTTTGTC TT 

Seq ID NO: 335 Protein sequence
Protein Accession #: NP_004117.1

1 11 21 31 41 51

| | | | | |

MPAIiEIEDLP EKBKLKNEVE QLRKEVKLQR QQVSKCSEEI KNYIBBRSGB DPLVXSIPED 60 

RNPFKEKGSC VIS 



Seq ID NO: 336 DNA sequence 
Nucleic Acid Accession #: NM_005795
Coding sequence: 555-1940

1 11 21 31 41 51

| | | | | |

GCAOGAGGGA ACAACCTCTC TCTCTSCAGC AGAGAGTGTC ACCTCCTGCT TTAGG ACCAT 60 

CAAGCTCTGC TAACTGAATC TCATCCTAAT TGCAGCATCA CATTGCAAAG CTTTCACTCT 120 

TTCCCACCTT GCTTGTGGGT AAATCTCTTC TGOGGAATCT CAGAAAGTAA AGTTCCATCC 180 

TGAGAATATT TCACAAAGAA TTTCCTTAAG AGCTGGACIG GGTCTTGACC CCTGQAATTT 240 

AAGAAATTCT TAAAGACAAT GTCAAATAT6 ATCCAAGAGA AAAT8TGATT TGA6TCTGGA 300 

GACAATTGTQ CATATCGTCT AATAATAAAA ACCCATACTA GCCTATAGAA AACAATATTT 360 

GAATAATAAA AAOCCATACT AGCCTATAGA AAACAATATT TGAAAGATTG CTACCACTAA 420 

AAAGAAAACT ACTACAACTT GACAAGACTG CTGCAAACTT G AATTGG TCA CCACAACTTG 480 

ACAAGGTTGC TATAAAACAA GATTGCTACA ACTTCTAGTT TATGTTAXAC AGCA TATTTC 540 

ATTTGGGCTT AATGATGGAO AAAAAGTGTA CCCTGTATTT TCTGGTTCTC TTGOCTTTTT 600 

TTATGATTCT TOTTACAGCA 6AATTAGAAG AGAGTCCTGA GQACTCAATT CAGTTGQGAG 660 

TTACTAGAAA TAAAATCATG ACAGCTCAAT ATGAATGTTA CCAAAAGATT ATGCAAGACC 720 

CCATTCAACA AGCAGAA6GC GTTTACTGCA ACA6AACCTG GGATGGATGG CTCTG CTGG A 780 

ACGATGTTGC AGCAGGAACT GAATCAATGC AGCTCTGGCC TGATTACTTT CAGGACTTTO 840 

ATCCATCAGA AAAAGTTACA AAGAT C TOTG ACCAAGATGO AAACTGGTTT AGACATCCAG 900 

CAAGCAACAG AACATGGACA AATTATACCC AGTGTAATGT TAACACCCAC GAGAAAGTGA 960 

A6ACTGCACT AAATTTGTTT TACCTGACCA TAATTGGACA CGGATTGrCT ATTGCATCAC 1020 

T6CTTATCTC GCTTG6CATA TTCTTTTATT TCAAGAGCCT AAGTTGCCAA AGGATTACCT 1080 

TACACAAAAA TCTGTTCTTC TCATTTGTTT GTAACTCTGT TGTAACAATC ATTCACCTCA 1140 

CIGCAGIOGC CAACAACCAG GCCTTAGTAG CCACAAATCC TGTTAGTTGC AAAGT6TC0C 1200 

AGTTGATTCA TCTTTAOCTG ATQ66CT6TA ATTACTTTTG GATGCTCTOT GAAGGCATTT 1260 

AOCIACACAC ACTCATTGT6 GTG6CCGTGT TTGCAGAGAA GCAACATTTA ATGTGGTATT 1320 

ATTTTCTTGO CXGGGGATTT OCACTGATTC CTGCTTGTAX ACATGCCATT GCTAGAAGCT 1380 
TATATTACAA TGACAATTGC TGGATCACTT CTGA3ACCCA TCTCCICXAC ATTATCCATG 1440

GCCCAATTTG TGCTGCTTTA CTQC5TGAATC ' ITrrmViT OTTAAATATT GTAOGCGTTC 1500 

TCATCACCAA GTTAAAAGTT ACACACCAAG OGGAATCCAA TCTGTACATG AAAGCTGTGA 1560 

GAGCTACTCT TATCTTGGTG CCATTGCTTG GCATTGAATT TGTGCTGATT CCATGGOGAC 1620 

CT6AAGGAAA GATTGCAGAG GASGTATATG ACTACATCAT GCACATCCTT ATGCACTTGC 1680

A66GTCTTTT GGTCTCTACC ATTTTCTGCT TCTTTAATOG A6AGGTTCAA GCAATTCTGA 1740 

GAAGAAACTG 6AATCAATAC AAAATCCAAT TTGGAAACAG Cl l'TT OCA AC TCAGAA6CTC 1800 

TTCGTAGTGC GTCTTACACA GTGTCAACAA TCAGTCATGG TOCAGGTTAT AGTCATGACT 1860

GTCCTAGTGA ACACTTAAAT GGAAAAAGCA TCCATGATAT TGAAAATGTT CTCTTAAAAC 1920 

CAGAAAATTT ATATAATTGA AAATAGAAGG ATGGTTGTCT CACTGTTTGG TX3CTTCTCCT 1980 

AACTCAAGGA CTT6GACCCA TGACTCTGTA GCCAGAAGAC TTCAATATTA AATGACTTTG 2040 

G6GAATGTCA TAAAGAAGA6 CCTTCACAT6 AAATTAGTAO TGT6TTGATA AGAGTGTAAC 2100 

ATCCAGCTCT ATGTGGGAAA AAAGAAATCC TGGTTTGTAA TGTTTGTCAG TAAATACTCC 2160 

CACTATGCCT GATGTGACGC TACTAACCTG ACATCACX3\A GTGTGGAATT GGAGAAAAGC 2220 

ACAATCAACT TTTCTGAGCT 6GTGTAAGCC AGTTCCAGCA CACXATTGAT GAATTCAAAC 2280 

AA AT GOCTGT AAAACTAAAC ATACATGTTQ GGCAT6A7TC TACOCTTATT CSOCC CAAGA 2340 

6A0CTAGCTA AGGTCTA7AA ACATGAA06G AAAATTAGCT TTTAGTTTTA AAA CTCTTT A 2400 

TCCCATCTTG ATTGGGGCAG TTGACTTTTT TTTTTTCCOl GAGTGCCGTA GTCCTTTTTG 2460 

TAACTACCCT CTCAAATGQA CAATACCAGA AGTGAATTAT CCCTGCTQGC TTTCTTrTCT 2520 

CTATGAAAAG CAACTGAGTA CAATTGTTAT GATCTACTCA TTTGCTGACA CATCAGTTAT 2580 

ATCTTGTGQC ATATOCATTQ TGGAAACTG6 ATGAACAGGA TGTATAATAT 6CAATCTTAC 2640 

TTCTATATCA TTAGGAAAAC ATCTTAGTTG AT6CTACAAA ACACCTTGTC AACCTCTTCC 2700 

TGTCTTACCA AACAGTGGGA GGGAATTCCT AGCTGTAAAT ATAAATTTTG CCCTTCCATT 2760 

TCTACTGTAT AAACAAATTA GCAATCATTT TATATAAAGA AAATCAATGA AGGATTTCTT 2820 

ATTTTCTTGG AATTTTGTAA AAAGAAATTG TGAAAAATGA GCTT6TAAAT ACTCCATTAT 2880 

TTTATTTTAT WSTCTCAAAT CAAATACATA CAAOCTATGT AATTTTTAAA GCAAA TATAT 2940 

AATGCAACAA TCTGTGTATG TTAATATCTG ATACTGTATC TGGGCZGATT TTTTAAATAA 3000 
AATAQAGTCT GGAATGCT 

Seq ID NO: 337 Protein sequence
Protein Accession #: NP_005786.1

1 11 21 31 41 51

| | | | | |

MSKKCTLYFL VLLPFFMILV TAELEBSPED 5IQLGVTRNK IMTAQYEOfQ KIMQDPIQQA 60 

BGVYCKRTWD 6WLCWKDVAA GTESKQLCPD YFQDFDPSEK VTECICDQDGH HFREPASNRT 120 

HTHYTQOJVN THERVKTALN LFYLTIIGHG LSIASLLISL GIFFYFKSLS CQRITLBKNL 180 

FPSPVCNSW TIIHLTAVAN NQALVATNPV SCKVSQPIHL YLMGCNYPW4 LCEGIYLHTL 240 

IWAVFAEKQ HLMWYYFLGM GPPIilPACIH AIARSLYXND NC5JISS0THL LYIIRGPICA 300 

ALLVNLPFLL NIVRVLITKL KVTHQAESNL YMKAVRATLl LVPUiGIEFV LIPWRPBGKI 360 

AEEVYDYIMH ILMHFQOIiLV STIPCPENGB VQA1I.RRNWK QYKIQFGMSF SNSEALRSAS 420 
YTVSTISDGP GYSHDCPSEH LIfGKSIHDIE NVLLKPENLY N 

Seq ID NO: 338 DNA sequence
Nucleic Acid Accession #: NM_001795
Coding sequence: 25-2379

1 11 21 31 41 51

| | | | | |

GCAOGATCTG TTCCTCCTGG GAAGATGCAG AGGCTCATGA TGCTCCTC6C CACATOGGGC 60 

GCCTGCCTGO GCCTGCTGGC AGTGGCAGCA GTGGCA6CAG CAGGTGCTAA CCCTGOCCAA 120 

GGGGACACCC ACAGCCTGCT 6CCCACCCAC CGGOGCCAAA AGAGAGATTG GATTTGGAAC 180 

CAGAIGCACA rTQATGAAOA GAAAAACACC TCACTTCCCC ATCATGTAOa CAAGATCAAG 240 

TCAAG06TGA GTCXXIAAGAA T6CCAAGTAC CTGCTCAAAG GAGAATATGT GGGCAAG6TC 300 

TTOCGGGTOG ATGCAGAGAC AGGAGAOGTG TTOGOCATTG A6AG6CTGGA COGGGAGAAT 360 

ATCTCAGAGT ACCACCTCAC TGOTGTCATT GTGQACAAGG ACACTGGTGA AAACCTGGAG 420 

ACTCCTTCCA GCTTCACCAT CAAAOTTCAT GA0GTGAA06 ACAACTGGCC TGTGTTCAOG 480 

CATGGGTIGT TCAATOOQTC OSTOCCTQAQ TC3 G T0 G GCTG TGGGGACCTC AGTCATCTCT 540 

GIGACA6CAG T6GATGCAGA CGACCCCACT GTGG6AQACC A0G0CTCT6T CATOTACCAA 600 

ATCCTQAAQG GGAAAGAGTA TTTTCCCATC GATAATTCTG GAOGTATTAT CACAATAAC3G 660 

AAAAGCTTGG ACOGAGAGAA GCAGGCCAGG TATGAQATOQ TOOTOQAAGC GCGAGATGCC 720 

CAGCGCCTCC GGGGGGACTC GGGCAOGGCC ACaSTGCTGG TCACTCTGCA AGACATCAAT 780 

GACAACTTCC OCTTCTTCAC GCAOACGAAO TACACATTTG TOSTGCCTGA AGACACCCST 840 

GTGGGCACCT CTGTOGOCTC TCTGTTTGTT GAGGACCCAG AT6AGCCCCA GAAC06QATG 900 

ACCAAGTACA GCATCTTGCG GGGC3GACTAC CAGGACGCTT TCACCATTGA GACAAACCCC 960 

GCOCACAACG AGGGCATCAT CAAGCCCATG AAGCCTCTGG ATTATGAATA CATCCAGCAA 1020 

TACAGCTTCA TOGTOGAGGC CACAGACCCC ACCATOGACC TOOGATACAT GAGCCCTCCC 1080 

G06GGAAACA GAGCCCAGGT CATTATCAAC ATCACA8ATS TGGA0GA6GC CCCCATTTTC 1140 

CAGCAGCCTT TCTACCACTT CCAGCTGAAG GAAAACCAGA AGAAGCCTCT 6ATTGGCACA 1200 

GTGCTGQCCA TGGACCXTTGA TGCGGCTAGG CATA6CATT0 GATACTCCAT CXSCAGGAtX 1260 

AGTGACAAGG 6CX3U3TTCTT COGAGTCACA AAAAAGGGGG ACATTTACAA TGAGAAAGAA 1320 

CTOGACAGAG AA6TCTACCC CTGGTATAAC CTGACTGTGO AGGCCAAAGA ACTGGATTCC 1380 

ACT6GAACCC CCACAG6AAA AGAATCCATT GTGCAASTOC ACATT6AAGT TTTG6ATGAG 1440 

AATGACAAT6 GCCOQQAGTT TGCCAAGCCC TACCA6CCCA AAGTGT6TGA 6AACGCTGTC 1500 

CATGGCX3V0C TGGTCCTGCA GATCTCC3GCA ATAGACAAGG ACATAACACC AOQAAAOGTO 1560 

AAGTTCAAAT TCACCTT6AA TACT6AGAAC AACTTTAOCX: TCACXK3ATAA TCAOQATAAC 1620 

AOGGCCAACA TCACAGTCAA GTATGGGCAG T7TGAC0QGG AGCATACC3UI GGTCCACTTC 1680 

CTACCOGTGO TCATCTCAGA CAATGGGATG CCAAGIOQCA OGOOCAOCAO CACGCTGAOC 1740 

GroGCOGT G T 6CAAGTGCAA CXZA6CA0GGC GAGTTCAOCT TCTGOGAGGA TAT6GOC360C 1800 

CAGGTGGGaS TGAGCATCC3V GGCAGTGGTA GCCATCTTAC TCTGCATCCT CACCATCACa 1860 

GTGATCACCC TGCTCATCTT CCTGCGOOGG CGGCTCCGGA AGCAGGCCCG CGCGCACGGC 1920 

AAGAGCG7GC OGGAGATCCA CGA6CAGCTG GTCACCTAOS A0GAGGA6GG CQGCGGCGAG 1980 

ATGGACACCA CCA6CTA0GA TGTGTOaGTG CTCAACT0Q8 TG0QC0S0Q8 G0666GCAA6 2040 

CCCC060GGC C0G06CT6GA OGOOO GG CCT TC0CTCTAT6 O0CAGGT6CA GAAG0CA006 2100 

AGGCACGOGC CTGGGGCACA CGGAQGGCCC GGGGAGATGG CAGCCATGAT OGAGGTGAAG 2160

AAGGAOGAGG OGGACCAOGA CGGCGACGGC CCCCCCTAOS ACAOGCTGCA CATCTAOGGC 2220 

TACGAG6GCT CCGAGTCCAT AGCCGAGTCC CTCAGCTCCC TGGGCAOOGA CTCATC06AC 2280 
TCTGAOGTOO ATTAOGACTT CCTTAAOGAC TGGGGAOOCA GGTTTAAOAT GCTOGCTGAG 2340

CTGTAG66CT CGGACCCCGG GGftGGAGCTG CTGTATTAOG OG6CGC3AGGT CACTCTGGGC 2400 

CTGGGGACCC AAACCCCCTO CAGCCCAGGC CAGTCAGACT CX31GGCACCA CW3CCTCCAA 2460 

AAATGGCAGT GACTCCCXM CCCAGCACCC L T ftX'i'UJltS GGTCCC3W5AG ACCTCATCAG 2520 

CCTTGGGATA GCAAACTCCA GGTTCCTGAA ATATCCAGGA ATATATGTCA GTGATGACTA 2580

TTCTCAAATQ CTGGCAAATC CAG6CT0GT0 'n'C l 'OTCT O O GCTCAfiACAT CCACATAACC 2640 

CTGTCAOCCA CAGACC6CG6 TCTAACTCAA AGACTTCCTC TG6CTGCCCA A66CT6CMA 2700 

GCAAAACAGA CTGTGTTTAA CTGCTGCAGG GTCTmTCT AGGGTCCCTG AACGCCCTGG 2760 

TAAGGCTGtrr GAGGTCCTGG TGCCTATCTG CCTGGAGGCA AAGGCCTGGA CAGCTTGACT 2820 

TGTGGGGCRG GATTCTCTGC AGCCCATTCC CAAGGGAGAC TGACXATCAT GCXXTTCTCTC 2880 

GGGAGCCCTA GOCCIGCTCC AACTGCATAC TGCACTCCAA G TGOCOCACC ACTCCCCAAC 2940 

CXXrrCTOCAG GCCTGTCAAG AGGGAGGAAG GG6CCCX3VT0 GCAGCTCCTG'ACCTTGGGTC 3000 

CTGAAGTGAC CTGACTGGCC TGCCATGCCA GTAACTGTGC TGTACTGAGC ACT6AACCAC 3060 

ATTCAGGGAA ATGCTTATTA AACCTTGAAG CAACTGTGAA TTCATTCTGG AGGGGCAGTG 3120 

6AGATCASGA 6T6AC318ATC ACAGGGTGAG 6G0CA0CTCC ACACCCACXX: CCTCTGGAGA 3180 

ABGCCTGOAA GAGCTGAGAC CTTGCTTTGA GACTOCTCAO CACOCCTCCA GTTTTGOCPG 3240 

AGAASGGGCA GATOTTCCCG GAGATCAGAA GAOGTCTCCC CTTCTCTGCC TCACCTGGTC 3300 

GCCAATCCAT GCTCTCTTTC TTTTCTCTGT CTACTCCTTA TCCCTTCGTT TAGAGGAACC 3360 

CAAGATGTGG CCTTTAGCAA AACT6ACAAT GTCCAAACCC ACTCATGACT GCATGAOGGA 3420 

GCOGAGCATG TGTCTTTACA CCTOGCT6TT GTCACATCTC AGGGAACTGA OCCTCAGGCA 3480 

CACCTTGCAG AAGGAAGGCC CTGCCCTGCC CAACCTCTGT GGTCACCCAT GCATCATTOC 3540 

ACTGGAA06T TTCACTGCAA ACACACCTTG GAGAAGTGGC ATCAGTCAAC AGA6A6G6GC 3600 

AGGGAAGGAG ACACCAAGCT CACCXmOGT GATGGACOSA GGTTOCCACT CTGGCAAAGC 3660 

CCCTCACACT GCAAGGGATT GTAGATAACA CTGACTTGTT TGTTTTAACC AATAACTAGC 3720 

TTCTTATAAT GATTTTTTTA CTAATGATAC TrACAAGTTT CTAGCTCTCA CAGACATATA 3780 

GAATAAGGGT TTTTGCATAA TAAGCAGGTT GTTATTTAGG TTAACAATAT TAATTCAGGT 3840 

TTTTTAGTTG GAAAAACAAT TCCTGTAACC TTCTATTTTC TATAATTGTA STAATTG CTC 3900

TACAGATAAT GTCTATATAT TGGCCAAACT G6TGCATGAC AAGTACTGTA TTTTTTTATA 3960 
CCTAAATAAA GAAAAATCTT TAGCCT6GGC AACAAAAAAA 

Seq ID NO: 339 Protein sequence
Protein Accession #: NP_001786

1 11 21 31 41 51

| | | | | |

MQRUWLLAT SGACLGLIAV AAVAAAGANP AQRDTHSLLP THRRQKRDMI HNQMHID2BK 60 

NTSLPHKVGK IKSSVSRKUA KYUURGEYVO KVFRVDAETG DVFAIERLDR EZ7ISEYHLTA 120 

VZVDXDTGEIf LBTPSSFTZK VHDVNQNWFV FTHRLFNASV PBSSAVGTSV ISVTAVDADD 180 

PTVGDBASVM YQIUCGXEyF AIDNS6RZIT ZTXSLOREKQ ARYEZWEAB DAQGLRGDSG 240 

TATVLVTLQD INDNFPFFTQ TKYTPyVPED TRVGTSVGSL FVEDFDEPQN RMTKYSHJIG 300 

DYQDAPTIET NPAHNBGIIK PMKPUJYEYI QQYSFIVEAT DPTIDLRYMS PPAtaillAQVI 360 

XNirriTVDEPP IFQQPFYHFQ LXENQKKPLI GTVLAMDPDA ARHSIGYSIR RTSDKGQFFR 420 

VTKKGDZiniE KSLDREVYPW YNLTVEAKBL DSTGTFTGXE 8ZVQVHZEVL DENISIAPEFA 480 

KPYQPKVCEH AVKGQLVLQZ SAZDKDZTPR NVKFKFTLNT BlHFTLTmiH DKTANZTVKy 540 

GQFDREHTBCV HFLPWISDN GMPSRTGTST LTVAVCKOIE QGEPTPCEDM AAQVGVSIQA 600 

WAILLCILT ZTVITLLIFL RRRLRKQARA HGKSVPEIHB QLVTYDEEOG GEMDTTSYDV 660 

SVLNSVRRGG AKPPRPALDA RPSLYAQVQK PFRBAPGAEG GPGEKAAMZS VKXDEADBDG 720 

DGPFYSTUrZ YGYSSSESZA ESLSSLGTDS SDSDVDrOFL NDMGPRFKNL AELY6SDPRE 780
BLLY 



Seq ID NO: 340 DNA sequence
Nucleic Acid Accession #: NM_003088
Coding sequence: 112-1593

1 11 21 31 41 51
G0GGAGG6TG OOTGOBGCCC GCQGCAGOOQ AACAAAGQAG CA0GQGG6CC GCOQCAGGGA 60

CCC3GCCACCC AOntXXX SG G GCOQCGCAGC GGCCTCTCGT CTACT6CCAC CAT6AC06CC 120 

AAOOOCACAQ COSAGGOGGT GCAGATCCAO TTCGGCCTCA TCAACTGCGG CAACAAGTAC 180 

CTGA0GCCCX5 AGGOGTTCGG GTTCAAGGTG AACGCGTCOS CCAGCAGCCT GAAGAAGAAG 240 

CAGATCTGGA OGCTGGAGCA 60CCCCTGAC GAGG06GGCA G060GGCC6T GTGCCTGCOC 300 

AGCCAOCTQG GOOGCTACCT 6GO0GG6GAC AAGGACX^GCA AOGTGACCTG GGAOOGOGAG 360 

GTGCCOOGTC COGACTGCCO TTTCCTCATC GTGGC0CA05 A0GACG6TOG CTGGTC6CTG 420 

CAOTCCGAGG CGCACXX3GCG CTACTTCGGC GCCACCGAGG ACCGCCTGTC CTGCTTCGOG 480 

CAGAOGGTGT CCCCOGCCGA CAAGTGGAGC GTGCACAT06 OCATGCAOCC TCAQGTCAAC 540 

ATCTACAG1X3 TCACCOSTAA GCGCTA06GG CACCT6AG0G CGOSGCOGGC CGACX3AGATC 600 

GC O GTG G ACC G0GAG6T6CC CT6GG60GTC GACTCX3CTCA TCAOCCTOOC CTT0CA6GAC 660 

CA606CTACA G0GTGCA6AC O6C0GACCAC OSCTTCCTGC QCCAOGACGG GOGCCTOGTG 720 

GCG0G0CC50G AGCCX3GCCAC TGGCTAOUM CTOGA6TTCC GCTCCGGCAA 6GTGGCCTTC 780 

CGOGACTGOG AGG6C06TTA CCTGGCX3C06 TOGGGGCCCA G0G6CA0GCT CAAGGCX3GGC 840 

AAGOCCACCA AGGTGGGCAA GGACGAGCTC nTGCTCTGG AGCAGAGCTO GOCOCAGGTC 900 

GTGCTGCAGG 0GGCCAAC6A GAOGAAOGTG TCCAOOOGCC AOGOTATGGA OCTGTCTGCC 960 

AATCAGGAOG AGGAGACC8A CXAGGAGACC TTCCA6CTGG AGATOQACGG 06ACACCAAA 1020 

AAGTGTGCXrr TCCGTACCCA CAOGGGCAAG TACTGGAOGC TGAOGGCCAC CGGGGGCGTG 1080 

CAGTCCACCG CCTCCAGCAA QAATOCCAGC TX3CTACTTTG ACATOGAGTG GCGTGACCGG 1140 

CGCATCACAC TGAGGGOGTC CAAT66CAA6 mXJlXiACCT CCAAGAA6AA TGGGCA6CT6 1200 

GCCGCCTCGG TGGAGACAGC AGGGQACTCA 6A0CTCTTCC TCA1GAAGCT CATCAACCGC 1260 

CCCATCATC36 TGTTCCGCX3G G6AQCATGQC TTCATOQGCT GCG0CAA6GT CAC6GGCACC 1320 

CrOGAOGCCA ACOGCTCCAO CTATGAOGTC TTCCAGCTGG AGTTCAAOGA TGGCGCCTAC 1380 

AACATCAAAG ACTCCACAGQ CAAATACTGG ACGGTGGGCA GTGACTCXBC GGTCACCAGC 1440 

AGCXSGOSACA CTCCTGTGGA CTTCTTCTTC GAGTTCTGas ACTATAACAA G6TG6CCATC 1500

AAGGTGGGCG GG06CTACCT GAAG6G0GAC CA06CAGG0G TCCTGAAGQC CTOGGOGGAA 1560 

AOOGTGGACC COGCCTOGCT CTGGGAGTAC TAGGOCOGGC COJi'C'CJTCC COSCCCCTSC 1620 

OCACATG60C GCTCCTGCCA ACCCTCCCTG CTAACCCXrTT CTCX3KXAGG TGGGCTCCAG 1680 

G606GGAG6C AAGCCCCCTT GCCTTTCAAA CTGGAAACXX! CAGAGAAAAC GGTGCCCCCA 1740 

0CT6TOQCCC CTATGGACTC CXXACTCTCC CCTCOGCOOG GGTTCCCTAC TCCCCTOQQG 1800 
TCftGOGGCTG CS66CCTGGCC CIGG6AGGGA TTTCAGATGC CCCTGCCCTC TTGTCTGCCH 1860

OGGGGOBftGT CTtSGCAOCTC TTTCTTCTGA CCTCAGAOGG CTCTGAGCCT TATTTCTCTO 1920 

GAAGOQGCTA AGGGAOGGTT GGGGGCTGGG AGCCCTGGGC GTGTAGTGTA ACTGGAATCT 1980 

TTTGCCTCTC CCAGCCACCT CCTCCCAGCC CCCCAGGAGA GCTGGCXaCA TGTCCCAAGC 2040 

CT6TCAGTG0 CCCTOCCTGG TGCACT6TGC CC6AAAGCCC TGCTTGG6AA 6GGAAGCTGT 2100 

0666JVGG6CT AGGACTGftCC m Vl'GtfiXST ' n'l ' l ' rStMJffV 0GTG6CIGGA AACAGCCCCT 2160 

CTCCGAOnG G6AGAGGCTC AG O CTGGCTC CCTTOCCTQO AG0GGCAGG6 0GTGA0G60C 2220 

ACAGGGTCTG CCCGCTSCAC GTTCTGCCAA GGTGGTGGTG GOSGGOGGGT AGGGGTGTGG 2280 

GGGCOGTCTT CCTCCTGTCT CTTTCCTTTC ACCCTAGCCT GACTGGAAGC AGAAAATGAC 2340 

Ca^TCAGTA TTTTTTTTAA TGAAATATTA TTGCTGGAC3G CX3TCCCA0GC AAQCCTGGCT 2400 

GTAGTftGOGA GTGATCTGGC GG0GGGCX3TC TCAGCSlCCCr CCOCftGGGGO 7GCATCTCA6 2460 

CCCCCTCTTT COGTCCTTCX: OGTCX»GCCC CAGCCCTGGG CCTGGGCTGC C36ACACCTGG 2520 

GCCAGAGCCC CTGCTGTGAT TGGTGCTCCC TG6C3CCTCCC GGGTGGATGA AGCCAGGCGT 2580

CGCCCCCTCC GGGAGOCCTG GGGTGAGCOG CCGGGGCCXX: CCTGCTGCCA GCCTCCCCCG 2640 

TCCCCAACAT GCATCTCACT CTQGGTGTCT TGGTCnTTA TTTTTTGTAA GTGTCATTTG 2700 

TATAACTCTA AACGCCCATG ATACTA6CTT GAAACTGGAA ATAOOQAAAT AAAATAACTC 2760 
AtnCTGC 

Seq ID NO: 341 Protein sequence 
Protein Accession #: NP_003079

1 11 21 31 41 51

| | | | | |

MTANGTAEAV QIQPGIjINOG NKYLTAEAFG FKVNASASSL KKKQIWTLBQ PPDEAGSAAV 60 

CLRSHLGRYL AADKDGKVTC EREVPGPZX31 FLIVAHDDGR HSIXJSEABRR YFGGTEDRLS 120 

CFAQTVSPAE KHSVHIAMEF QVNIYSVTRX RYAHIiSARPA DEIAVDRDVP WOVDSLXTXA 180 

FQDQRySVQT ADaRFLRHDG RLVARPEPAT GYTXiEFRSGX VAFSDCBGRY ZAPS6P5GTL 240 

KAGKATKVGK DSLFALEQSC AQWLQAANS RNVSTRQQ4D LSANQDEETD QETFQLEIDR 300 

DTKKCAFRTH TGKYWTLTAT GGVQSTASSK NASCYPDIEW RDRRITLRAS NGKFVTSKKN 360 

GQLAASVBTA GDSELPLMKL INRPIIVFRG EBGPIGCRJCV T6TLDAKRSS YDVFQLEFND 420 

GAYNIKDSTG KYW7VGSDSA VT8S6DTPVD FFFBFCDYNK VAIKVGGRYL KGDHAGVLKA 480 
8AETVDPASL WEY 



Seq ID NO: 342 DNA sequence

Nucleic Acid Accession #: FGENESH predicted

Coding sequence: 660..1705

1 11 21 31 41 51
CGCTCOGCAC ACATTTCXrrO TOSOGOCCTA AGGGAAACTG TTGGCCGCTG GGCC0GCX3GG 60

G6GATTCTTG GCAGTTGGGG GGTCXXSTOSG GAGCGAG06C GGAGGGGAAG G6AGGGGGAA 120 

COGGGTTGGO GAAGCCA6CT GTAGA0GGO6 6T6AC0Q0SC TCGAGACACA OCTCTGOGTC 180 

CTCGAGCX3GG ACA6ATCCAA GTTGGGAGCA GCTCTOGGTG 060G6CCTCA GAOAATGAGG 240 

CCGGCGTTOG CCCTGTGCCT CCTCTGGCAO GCGCTCTGGC CCGGGCCGGG OGGCGGOGAA 300 

CACCCCACTG CCGACCGTGC TGGCTGCT03 GCCTCGGGGG CCTGCTACAG CCTGCACCAC 360 

GCTACOITGA AGCGGCAGGC GGCCX5AGGAG GCCTGCATCC TGOGASGTGG GGCGCTCAGC 420 

ACOGTGOCnQ GQGGOSCXXSA GCTGOGGGCT GT6CTCG0GC TCCTQ08GGC AGGCCCAG60 480 

CC06GA9GQ0 OCTCCAAAGA CCTGCTGTTC T0G6T06CAC TG6A003CA0 GOGTTCCCAC 540 

TGCACCCTGG AGAACGAGCC TTTGOGGGGT TTCTCCTGGC TGTCCTCCGA CCCCGGCGGT 600 

CTCGAAAGCG ACACGCTGCA GTGGGTGGAG 6AGCCCX»AC GCTCCTGCAC CGCGCGQAGA 660 

TGCGOGGTAC TCCAGGCCAC CGGTG6GGTC GAGCCXXX3VG CTGGAAGGAO ATGC3GATGCC 720 

AOCTGG60GC CAAOQGCTAC CTGTGCAAGT ACCAGTTTGA GGTCTTGTGT CCTGG6C08C 780 

OOCCGOOGQC OGCCTCTAAC TTGAGCTATC G0600C0CTT CXaOCTQCAC AOCGOOGCTC 840 

TGGACTTCAG TCCACCTGGG ACCGAGGTGA GTGOOCTCTQ COGGGGACAG CTCCOGATCT 900 

CAGTTACTTG CATOGOGGAC GAAAT0GGCX3 CTCOCTQGGA CAAACTCTOG GGCGATGTGT 960 

T6TGTCCCTG CCCOGGGAGO TACCTCCGTG CTGGCAAATG OGCAGAGCTC CCTAACTGCC 1020 

TAOAGGACTT GGGAOGCTTT QCCTOOQAAT GTGCTA0GG6 CTT0GA6CT0 GGOAAGGAOQ 1080 

GC06CTCTT0 TGTGACCA6T GGGGAAGGAC AGCXXIACCCT TGGGGGGACC GGGGTGCCCA 1140 

CCAGGCGCCC GCCGGCCACT GCAACCAGCC CCGTGC06CA GAGAACATGG CCAATCAGGG 1200 

TC3GACGAGAA GCTGGGAGAG ACACCACTTG TCCCTGAACA AGACAATTCA GTAACATCTA 1260 

TTCCTGAGAT TCCT0GAT6G GGATCACAGA 6CACGATGTC TACCCTTCAA ATGTCCCTTC 1320 

AAGCC6AGTC AAAGQGCACT ATCACCCCAT CAG06AGCGT GATTTGCAAG TTTAAT TCTA 1380 

CGACTTOCTC TOCCACTCCT CftGGCTTTOG ACTCCTCCTC TGC03TQGTC TTCMarrTQ 1440 

TGAGCACAGC AGTAGTAGTG TTGGTGATCT TGACCATGAC AOTACTGGGO CTTGTCAAGC 1500 

TCTGCTTTCA OGAAAGCCCC TCTTCCCAGC CAAGGAAGGA GTCTATGGGC CCGOOGGGCC 1560 

TGGAGAGTGA TOCTGAGCCC GCXGCTTTGO GCTCCAGTTC TGCACATTGC ACAAACAAT6 1620 

GGGTGAAAGT GGGOOACTOT QATCTOGGOO ACAGAGCAGA GGGTGOCTTO CTGQ08QA0T 1680
COCCTCTTGO CTCTAGTGAT QCATAG 

Seq ID NO: 343 Protein sequence 
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

FK3KDFMTKTP KAFATKAKID KHDLIKLKSP CTAKETIIRV NSQPTDWQKT PAIYPSDKGV 60 

ZARIYKELEQ lYKKKKPTKT LRTHFLSRPK QfCHPLGPRG DSHQIX3GPSG ARABGKGGGT 120 

GU3KPAVBGG DRAFDTALRP RAOQIQVGSS SAOOASEHSA GVRFVPPLAG ALARAGSRRT 180 

PBCRPCHLLG LGGLLQPAFR YHEAAGGRGG I£PARWGAQE RAOGRRAARC ARAPAGRPRA 240 

RRGLQRPAVL GRTGAQAPPL BPGERAFAGP XiLAVLRPRRS RKRHAAVGGG APTLLHRAEM 300 

RGTPGHRW6R ARSHKEMRCH LRANGYLCKY QFEVLCPAPR PGAASNLSYR APFQI<HSAAL 360 

DPSPPGTEVS ALCRGQLPIS VTCIADEIGA RHDKIjSGDVL CPCPGRYUtA GKCAELPNCL 420 

DDLGGPACBC ATGPELGKDG RSCVTSGEGQ PTLGGTGVPT RRPPATATSP VPQRTHPIKV 480 

OHKIiGBTPLV PBG2DNSVTSI PBIPRH6SQS TMSTLQMSLQ AESK AT1TP9 GSVISKFNST 540 

TSSATPQAFD SSSAWFXFV SXAVWLVXXi TMIVLGLVKIi CFBESPSSQP RKBSMQPP6L 600 
ESDFEPAAU3 SSSABCIIQIQ VK9GIK33LBD RAB8ALLABS PLGSSDA 
Seq ID NO: 344 DNA sequence
Nucleic Acid Accession #: NM_012072
Coding sequence: 149-2107

1 11 21 31 41 51

AAAGCCCTCA GCCTTTGTGT CCTTCTCTGC QCOaSMJKG CTGCaGCTCA OCXXTTCAGCT 60 

CCCCTTCGGG CCCAGCTGGG AGCCGAGATA QAAGCTCCTG TOGCOSCTGG GCTTCTCGCC 120 

TCCOSCAGAG GGCCACACAG AOACXXSGGAT GGCCACCTCC AT6GGCCTGC TGCT6CXGCT 180 

GCTGCTGCrC CTGACCCAGC CCX3GGG0GGG GA0GG6AGCT GACAOGGAGG CGGTGGTCXG 240 

CXnXSGGGACC GCCTCCTACA OGOCCCACTC GGGCAAGCTO AGOGCTGCOS AGGCOCAGAA 300 

CCaCTGCAAC CAGAAOSGGG GCAACCTGGC CACTGTGAAG AGCAAGGAGG A£3GCCCA0CA 360 

0OTCCAGC3GA GTACTGGCCC AGCTCCTGAfl GCGC3GAGGCA GCCCTGAOGG CGAGGATGAG 420 

CAACn-rCTGG ATTGGGCrCC AGCGAGAGAA GGGCAAGTGC CTGGACCCTA CSTCTCCCGCT 480 

GAAGGGCTTC AGCTGGGTGG G0GGG6GGGA GGACACGCCT TAC TCTA ACT OGCACAAGGA 540 

GCrCOSCSAAC TOGTGCATCT CCAAGOSCTG TGTGTCTCTG CTGCTGGACC TSTCCOgCC 600 

OCTCCTTCCC AACOGCCTQC CCAAfiTGGTC TGAGGGCCCC TGTGGGAGOC CftGGCTOCCC 660 

OGGAAGTAAC ATTCAQ6GCT TC G TGTGCAA GTTCAGCTTC AAAGGCATGT GCOGGCCTCT 720 

GGCCCTGGGG GGCCCAGGTC AGGTGACCTA CACCACCCCC TTtXAGACCA CCAGTTCCTC 780 

CTTGGAGOCT GTGCCCTTTG CCTCTGOSGC CAATGTAGCC TGTGG GGAAG OXGACAAGGA 840 

OGAGACTCAG AGTCATTATT TCCTOTGCAA GGAGAAGGCC CCCGATGTGT TCQACTGGGG 900 

CSMSCTOGOQC CCCCTCTGTG TCAG0C3CCAA GTATGGCTGC AACTTCAACA ATGGGGGCTG 960 

CCACXaGGAC TGCTTTGAAG GGOGSGATOG CTCCTTCCTC TGCGGCTGCC GACCAGGATT 1020 

CCGGCTGCTG GATGACCTCG T6ACXTCTGC CTCTCGAAAC CCTTGCAGCT CCAGCCCATC 1080 

TCGTGGGGGO GCCACGTGCX3 TCCTGGGACC CCATGGGAAA AACTACAOOT GCOGCTOCCC 1140 

OCAAGGGTAC CAGCTGGACT CGAGTCAGCT GGACTGTGTG GAOGTGGATG AATGCCAQGA 1200 

CTCCCCCTGT OCCCAGGAGT GTGTCAACAC CCCTGGGGGC TTCCGCTGCG AATGCTGiSGT 1260 

TGGCTATGAfi COGGGOOGTC CTGCSAGAiSGG GGCCTGTCAG GATGTGGATC ACrrOTGCTCT 1320 

GGGTCGCTOG CCTTOCGCCC AGGGCTGCAC CAACACAGAT GGCTCATTTC ACTGCTCCTG 1380 

TCAGGAGGGC TAOGTCCTGG CCX3GGGAGGA CGGGACTC3U5 TGCCAGGACG TGGATGAGTG 1440 

TGTCGGCCOG GGGGGCCCCC TCTGOSACAG CTTGTGCTTC AACACACAA6 GGTCCTTCCA 1500

CTGTGOCTOC CTOCCAGGCr GQCTOCTOGC OXAAATGGG GTCTCTTGCA CCATGGGGCC 1560 

TOTOTCrCTG GGACCACCAT CTGGGCCOCC C6ATGAGGA0 GACAAAGGAG AGAAAGAAGG 1620 

GAGCACOGTG CCCCGCGCTG CAACAGCCAG TCCCACAAGG G6CCCCS3AGG GCACCCCCAA 1680 

GGCTACACCC ACCACAAGTA GACCTTCGCT GTCATCTGAC GCCCCCATCA CATCTGCCCC 1740 

ACTCAAGATG CTCGCCCCCA GTGGGTCCTC AGGCGTCTGG AGGGRGCCCA GC3VTCCATCA 1800 

CGCCACAOCT GCCTCTOGCC CCCAOOAQCC TGCRGeiGGG GACTOCTOOO TGGCCACACA 1860 

AAACAAOGAT GOCACTGACG GGOUUU^aCT GCTTTTATTC TACATCCTAO GC3VCCGTGGT 1920 

GGCCATCCTA CTCCTGCTGG CCCPGGCTCT GGGGCTACTQ GTCTATOSCA AGOGGAGAGC 1980 

GAAGAGGGAG GAGAAOAAGG AGAAGAAGCC CCAGAATGCG GCAGACAGTT ACTCC TQGGT 2040 

TCCAGAGCGA GCTGAGAGCA GGGCCATGGA GAACCAGTAC AGTCC3GACAC CTGOGACAGA 2100 

CTGCTGAAAG TGAGGTCQCC CTAGAGRCAC TAOAGTCACC AGCCAOCATC CTCAGAGCTT 2160 

TGAACTCCCC ATTOCAAAGO G6CACCX3VCA TTTTTTTGAA AGACTGGACT GGAATCTTAG 2220 

CAAACAATTG TAAGTCTCCT CCTTAAAGGC CCCTTGGAAC ATGCAGGTAT TTTCTACGGG 2280 

TGTTTGATGT TCCTGAAGTG GAAGCTGTGT GTTGGOSTGC CACGGTGGGG ATrTOGTGAC 2340 

TCTATAATGA TTCTTACTCC OXTXtA XVl l' TCAAATTCCA ATGTGACCAA TTCOSGATCA 2400 

GGGTGTCAGG AGGCTCGGGC TAAGGGGCTC CCCTGAAIAT CTTCTCTGCT CACTICCACC 2460 

ATCTAAQAOO AAAAGGTCAG TTGCTCATGC TGATTAOGAT TGAAATOATT TGTTTCTCTT 2520 

CCTAGGATGA AAACTAAATC AATTAATTAT TCAATTAGGT AAGAA6ATCT GGTTTTTTGG 2580 

TCAAAGGGAA CATGTTCGGA CTGOAAACAT TTCTTTACAT TTGCATTCCT CCATTTCGCC 2640 

AGCACAAGTC TTGCTAAAK5 TGATACTGTT GACATOCTCC AGAATGGCXA GAAG TGCAAT 2700 

TAACCTCTTA GGTGGCAAOG AGGCAOaAAQ TGCCTCTTTA GTTCrTACAT TTCTAATAOC 2760 

CTTGGGTTTA TTTGCAAAGG AAOCTTQAAA AATATGAGAA AAGTTGCTTO AAG TOCATTA 2820 

CA6GTGTTTG TGAAGTCACA TAATCTACGG GGCTAGGGCG AGAGAGGCCA GGGATTTGTT 2880 

CACAGATACT TGAATTAATT CATCCAAATG TACTGAGGTT ACCACACACT TGACTACGGA 2940 

TGTGATCAAC ACTAACAAGG AAACAAATTC AAGGACAACC TGTCTTEGAG CCAGGGCAGG 3000 

CCTCAGACAC CCTGCCTGTG GCCCOOOCTC CACTTCATCC TGCCX3GGAAT GCCAGTOCTC 3060 

OGAGCTCAGA CACAGGAAGC CCTGCAGAAA GTTCCATCAG GCTQTTTCCT AAAGGATGTG 3120 

TOAACGGGAG ATGATGCACT GTGTTTTGAA AGTTGTCATT TTAAAGCATT TTAGCACAGT 3180 

TCATA6TCCA CAGTTGATGC AGCATCCTGA GATTTTAAAT CC TGAAG TGT GGGTGGOSCA 3240 

CACACCAAGT AGGGAQCTAO TCAGGCAGTT TGCTTAAGGA ACTTTTGTTC TCTGTCTCTT 3300 

TTCCTTAAAA TPGGOQGTAA GGAGGGAAGQ AAGAOOGAAA GASATGACTA ACXAAAATCA 3360 

TTTTTACAGC AAAAACTGCT CAAAGCCATT TAAATTATAT CCTC31 TTTTA AAAOTTACAT 3420 

TTGCAAATAT TTCTCCCTAT GATAATGCAG TCGATAGTGT GCACTCTPTC TCTCTCTCTC 3480 

TCTCTCTCAC ACACACACAC ACACACACAC ACACACACAC AGAGACAOGG CACCATTCTG 3540 

CCTGGGGCAC TGGAACACAT TCCTGGGGGT CACCGATGGT CASAGTCACT AGAAGTTACC 3600 

TGAGTATCTC TGG6AGGCCT CATGTCTCCT GTGOaCTTTT TACCACCACT GT QCAGGA^ 3660 

ACAGACAGAG QAAATOTQTC TCCCTCCAAQ OCOCCAAAGC CTCAGAGAAA GOOTGTTTCT 3720 

GGTTTTOCXrr TAQCAATGCA TCGGTCTCTG AGGTGACACT CTGGAGTGGT TGAAGGGOTi 3780 

CAAGGTGCAG GGTTAATACT CTTGCCAGTT TTGAAATATA GATOCTATGG TTCAGAT TGT 3840 

TTTTAATAGA AAACTAAAGG GGCAGGGGAA GTGAAAGGAA AGATGGAGGT iVt^^^Q SGC 3900 

TOQATGGGGC ATTTGGAACT TCTTTTTAAA GTCATCTCAT GGTCTCCA6T TTT CAimqQ 3960 

AACTCXt3Q7Q TTTAACACTT AAGGGAGACA AAGGCT6TGT CCATTTG6CA AAACTTCCTT 4020 

GGCCAOGftOA CTCtAOCSTGA TGTGTGAAGC TQGGCAGTCT GTGGTGTOGA GA6CAGCCAT 4080 

CTOTCTG G CC ATtCAGAGSA TTCTAAAQAC ATGGCTGGAT GCGCTGCTGA CCAACATCAG 4140 

CACTTAAATA AATGCAAAT6 CAACATTTCT CCCTCTGGGC CTTGAAAATC CTTGCCCTTA 4200 

TCATTTGGGG TCAAGQAGAC ATTTCTGTCC TTGGCTTOOC ACAGCX30CAA OG CAGTC TGT 4260 

GTATGATTCC TGG6ATCCAA CGAGCCCTCC TATTTTCACA GTGTTCTGAT TGCTCTCACA 4320 

OCXXaGOCCC ATCGTCTOTT CTCTGAATGC AGCCCTCTTC TCAACAACAG GGAGGTCATG 4380 

GAACCCCTCT GTGGAACCCA CAAGOGGAGA AATGGGTGAT AAAGAATCCA GTTCCTCAAA 4440 

ACCTTCCCTO GCAGGCT GG G TCCCTCTCCT GCTGGGTGGT GCTTTCTCTT GCACACCACT 4500 
CCCACCAOGG GGGGAGAGCC AGCAACCCAA OCAG ACAG CT CAgj nt»-TGC ATCTGAT QGA 4560 
AAOCACTGGG CTCAAACAOG TGCTTTATTC TCCTGTTTAT TTTTGCTGTT ACTTTGAAOC 4620 

ATOGAAAtTC TTOTTTGOGG GAtCTTQQGG CTACAGTAQT GGGTAAACAA A TQCCCACC G 4680 
60CAAGAG6C CATTAACAAA TOGTCCTTGT CCTGAGGGGC CCCAGCTTGC TCGGOCGTGG 4740 
CACAGTGGOO AATCCAAGGO TCACAGTATG GQGAGAGGTG CACXXTTGCCA CCTGCTAACT 4800 
TCTOGCT A GA, CACAGItrCTT CTGCCCAGGT GAOCTGTTCA GCAGCAGAAC AAG0CAG66C 4860

CATGGGGA06 G6GGAAGTTT TCACTTGGAG ATG6AC3UXA AGAOUITGAA GUTTTGTTGT 4920 

CCAAATAGGT CAATAATTCT GGGAGACTCT TGGAAAAAAC TGAATATATT CAGGACCAAC 4980 

TCTCTCCCTC CCCTCATCCC ACATCTCAAA GCAGACAATG TAAAGAGAGA ACATCTCACA 5040 

CACC3CAGCTC GCCATGCCTA CTCATTCCTG AATTPCACGT GCCATCACTG CTCTTTCTTT 5100 

Cl ' lXTi 'T U ' i ' C ATTTGAGAAA GGATGCAGGA GGACAATTCC CACAGATAAT CIGAOQAAIG 5160 

CAGAAAAACC A6GGCAGGAC AGTTAT06AC AATGCATTA6 AACTTGGTGA 6CATCCTCTG 5220 

TAGAGGGACT CCACCCCTGC TCAACAGCTT GGCTTCCAGO CAAGACCAAC CACATCTGGT 5280 

CTCTGCCTTC GGTGGCCCAC ACACCTAAGC GTCATCGTCA TTGCCATAGC ATCATGATGC 5340 

AACACATCTA 06TGTAGCAC TAOGAOGTTA TGTTT6GGTA ATGTGG6GAT GAACTGCATG 5400 

AGGCTCTGAT TAAGGATGT6 GGGAAGTGGG C7GCGGXCAC TGTOGGOCTT GCAAGGOCAC 5460 

CTGGAGGCCT GTCTGTTAGC CAGTGGTG G A GGAGCAA66C TTCAS6AAG0 GCCAGCCACA 5520 

TGCCATCTTC CCTGOGATCA GGCAAAAAAG TGGAATTAAA AAGTCAAACC TTTATATGCA 5580 

TGTGTTATGT CCATTTTGCA GGATGAACTG AGTTTAAAAG AA iT'i ' i i ' li ' l TCTCTTCAAG 5640 

rPGCTTTGTC TTTTCCATCC TCATCACAAG CCCTTGTTTG AGTGTCTTAT CXCTGAGCAA 5700 

TCTTTOGATQ GATGGAGATG ATCATTAGGT ACTTTTGTTT CAACCTTTAT TGCTGIAAAT 5760

ATTTCT67GA AAACTAGGAG AACAGAGATG AGATTTGACA AAAAAAAATT QAATTAAAAA 5820 

TAACACAGTC TTTTTAAAAC TAACATAGGA AAGCCTTTCC TATTATTTCT CTTCTTAGCT 5880 

TCTCCATTGT CTAAATCAGG AAAACAGGAA AACACAGCTT TCTAGCAGCT GCAAAATGGT 5940 

TTAATGCCXX: CTACATATTT CCATCACCTT GAACAATAGC TTTAGCTTGG GAATCTGAGA 6000 

TATGATCCCA GAAAACATCT GTCTCTACTT CGGCT6CAAA ACCCATGGTT TAAATCTATA 6060 

TGGTTT6TGC ATTTTCTCAA CTAAAAATAG AGATGATAAT COGAATTCTC CATATATTCA 6120 

CTAATCAAAG ACACTATTTT CATACXAGAT TGCIGAGACA AATACTCACT GAAGGGCTT6 6180 

TTTAAAAATA AATTGTGTTT T6GTCTGTTC TTCTAGATAA TGCCCTTCTA TTTTAGGTAG 6240 

AAGCTCTGGA ATCCCTTTAT TGTGCTGTTG CTCTTATCTG CAAGGTGGCA AGCAGTTCTT 6300 

TTCAGCAGAT TTTGCCCACT ATTCCTCTGA GCTGAAGTTC TTTGCATAGA TTTGGCTTAA 6360 

GCTTGAATTA GATCCCTOCA AAGGCTTGCT CTGTGAT6TC AGATGTAATT GTAAATGTCA 6420 

GTAATCACTT CATGAATGCT AAATGAGAAT 6TAAGTATTT TTAAATGTGT GTATTTCAAA 6480 

TTTGTTTGAC TAATTCT6GA ATTACAAGAT TTCTATGCAG GATTTAOCTT CATCCTGTQC 6540 

ATGTTTCCCA AACTGTGAGG AGGGAAGGCT CAGA6ATCGA GCTTCTCCTC T6AGTTCTAA 6600 

CAAAATGGTG CTTTGAGGGT CA6CCTTTAG GAAGGTGCA6 CTTTGTTGTC CTTTGA6CTT 6660 
TCTGTTATGT GCCTATCCTA ATAAACTCTT AAACACATT 

Seq ID NO: 345 Protein sequence
Protein Accession #: NP_036204

1 11 21 31 41 51

| | | | | |

MATSKGLLLL LLLLLTQPGA GT6AZ»TEAW CVGTACYTAH S6XLSAAEAQ NHC33QNG(3ni 60 

ATVKSKCEAQ HVQRVLAQLL RREAALTARM SKFHIGIiQRB KGKCLDPSLP LKGFSWVGGG 120 

EDTPYSNUHK ELRNSCISKR CVSIiLLDLSQ PLLPNRLPKH SEGP06SPGS PGSKIBGFVC 180 

KFSFX04CRP LAIiGGPGQVT YTTPFQTTSS SLEAVPFA5A AKVAOGEGDK DETQSHYFEK: 240 

REKAPOVFDM GSSOPIiCVSP KSrGOiFNKGG CHQDCPEGGD GSFLOGCRPG FRLLDDLVTC 300 

AS8NPCSSSP CRGGATCVLO PBGKNYTCRC PQCYQliDSSQ LDCVDVDECQ DSPCAQBCVN 360 

TPG6FRCECW VGYEP6GPGE GACQDVDBCA LGRSPCAQGC TNTDGSFHCS CEEGYVLAGE 420 

DGTQCQDVDB CVGPGGPLCD SLCFNTQGSF HCGCLPGWVIi APNGVSCTMG PVSLGPPSGP 480 

PDEEDKGEXE GSTVPRAATA SPTRGPEGTP KATPTTSRP5 LSSDAPITSA PLKKLAPSGS 540 

SGVWRBPSXB HATAASSPQE PAGGDSSVAT QMNDGTDGQK LLLFYILGTV VAZLLLLALA 600 
LGUiVYRKRR AKREBiCKEKK PQNAADSYSW VPERABSRAM ENQYSPTPGT DC 

Seq ID NO: 346 DNA sequence
Nucleic Acid Accession #: Z31560
Coding sequence: <1-966

1 11 21 31 41 51

| | | | | |

CACAGGGCCC GCAT6TACAA CA1X»TGGAG AOQGAGCTGA AG00QCCQG6 CCCGCAGCAA 60 

ACTT066GGG 6C6G0G6C6G CAACTCCACC GCQGOQGOGG CC8G0Q6CAA CCAGAAAAAC 120 

AGCC06GACC 60GTCAAGCG 6CCCATGAAT GCCTTCATQQ T GTGGTCCCG C666CA60GG 180 

OGCAAGATGG CCCAOGAGAA CCCCAAGATG CACAACTCGG AGATCAGCAA GCGCCTGGGC 240 

GCCGAGTG6A AACTTTTGTC 6GAGACGGAG AAGOGGCCGT TCATOSACGA GGCTAAGOGG 300 

CTG0GA6CGC T6CACATGAA GGAGCACCC6 6ATTATAAAT AC0GGCCCC6 G036AAAACC 360 

AAGAGGCTCA TGAA6AAGGA TAAGTACACG CT6CC0GOOQ GGCTGCTGGC CCOOG60GGC 420 

AATA6CATG6 06AG0GGG6T C6GGGTGGGC GCCG6CCTGG 606GGGGC6T GAACCA60GC 480 

ATQGACAGTT ACGOGCACAT GAACGGCTGG AGCAAC6GCA GCTACAGCAT GATGCAGGAC 540 

CAGCTQGGCT ACCOGCAGCA CCCGGGCCTC AATGOGCACG GCGCAGOGCA GATGCAGOCC 600 

AT6CACCGCT AC6ACGTGAG OGCCCTGCAG TACAACTCCA TGACCAGCTC GCA6ACCTAC 660 

ATGAAOGGCT OBCOCACCTA CA6CAT6TCC TACTCGCAQC AGGGCACCOC TGGOITGGCT 720 

CTT6GCTCCA TGGGTTOGGT GGTCAAGTCC GAGGCCAGCT CCAGCCCCCC T6TGGTTACC 780 

TCTTCCTCCC ACTCCAGGGC GCCCTOCCAG 6CO0GGGACC TCOGGGACAT GATCAGCATG 840 

TATCTCCCOQ GCGCXX5AGGT GCCGGAACCC GC06CCCCCA GCAGACTTCA CATGTCCCAG 900 

CACTACCAGA GCGGCCCGGX GCCOGGCACG GCCATTAAGG 6CACACTGCC CCTCTCACAC 960 

ATGT6A0GGC 0GGACAGO6A ACTGGAGGGG GGAQAAATTT TCAAAOAAAA ACSAGGGAAA 1020 

TGGGAGG6GT GCAAAAGAG6 AGAGTAAQAA ACAOCATGGA GAAAACCCG6 TAC6CTCAAA 1080 
AAAAA 

Seq ID NO: 347 Protein sequence 
Protein Accession #: CAA83435

1 11 21 31 41 51

| | | | | |

BSARMYNMME TBLKPPGPQQ TSGGGGOiST AAAAGGNQKN SPDEVKRPMN AFMVHSRGQR 60 

RKMAQENPKM BKSEISKRLG AEWKLLSBTE XRPFIDBAKR LRAIiBMRSHP DYICVRPRfiKT 120 

KTLMKiaSRYT LPGGLLAPGG HSMASGVGVQ AQLGA6VNQR MDSyABMKGH SNGSYSMMOD 180 

QLGYFQHPGL KABGAAQMQP MHRYDVSAIiQ IQlSMTSSQTy MNGSPTYSKS YSQQGTPGMA 240 

LG5»3SWKS EASSSPPWT SSSHSRAPCQ A13)IiSDKISM YLPGABVPEP AAPSRLHHSQ 300 
HyQSGFVPGT AISGTLPISB N 
Seq ID NO: 348 DNA sequence
Nucleotide Accession #: NM_002638
Coding sequence: 120-473

1 11 21 31 41 51

| | | | | |

CAATACAiSCT AAGGAATTAT CCCTTGTAAA TACCACAGAC COGCCCPGGA GCC31GGCCAA 60 

GCTGGACTGC ATAAAGATTG GTATGGCXrTT AGCTCTTAGC C3UUVCACCTT CCTGACACCA 120 

TCAGGGCCAO CACCTTCTTG ATC3GTOGTCG TCTTCCTCAT OSCTGGGAOG CTGGTTCTAO 180 

AG6CA6CT6T CAOGGGAGTT OCTGTTAAAG GTCAAGACAC TOTCAAAGGC OQTGTTOCAT 240 

TCAATGGACA AGATCX:CGTT AAAGGACAAG TTrCAGTTAA AGGTCAAGAT AAA6TCAAAG 300 

CGCAAGAGCC AGTCAAAGGT CCAGTCTCCA CTAAGCXTGG Cr C CTGCCOC ATTATCTTGA 360 

TCCGGTGCGC CATGTTGAAT CCCCCTAACC GCTGCTTGAA AGATACTGAC TGCCCAGGAA 420 

TCAAGAAGTG CTGTGAAGGC TCTTGGGGGA T6GCCTGTTT OGTTCCOCAG TGAAGGGAGC 480 

CGGTCCTTGC TGCACCTGTG CCGTCCCCAC ACCTACAGOC CCCATCXGGT CCTAAGTCCC 540 

TGCTGCCCTT CCCCTTCCCA CACTGTCCAT TCTTCCTCCC ATTCAGCSATG 0CCA0GGCT6 600 
GAGCTGCCTC TCTCATCCAC TTTCCAATAA A 

Seq ID NO: 349 Protein sequence
Protein Accession #: NP_002629

1 11 21 31 41 51

| | | | | |

MRASSFblW VFLIAGTLVL EAAVTGVPVK 6QDTVKGRVP FNGQDFVKGQ VSVKGQDKVK 60 

AQEFNOCGPVS TKF6SCPIZL IRCAMWFN RCLKDIDCPG IKKOCBGS06 HACFVPQ 



Seq ID NO: 350 DNA sequence 
Nucleic Acid Accession #: NM_007183
Coding sequence: 75-2468

1 11 21 31 41 51

| | | | | |

GAATTCCOGA CAGGACGTGA AGATAGTTGG GTTTGGA6GC GGCOSCCAGG CCCAGGCCOG 60 

GTGGACCTGC CGCCATGCAG GACGGTAACT TCCTGCTGTC GGCCCTGCAQ CCTGAGGCCG 120 

GOGTGTGCTC CCTGGOGCTO CCCTCTGACC T6CAGCTGGA CCG C OGGGGC GCCGAGGGGC 180 

GGGAGGOOSA GOGGCTGCGG GCAGCC0GC6 TCCAGGA6CA GGTCC6C6CC GGCCTCTTGC 240 

AQCTGGGACA GCA6CG6CG6 CACAACG6GG 0C6CTGAGCC GGAGOCTGAG GCCQAGACTO 300 

CCAGAGGCAC ATCCAGGGGG CAGTACCACA CCCTGCAGGC TGGCTTCAOC TCTCGCTCTC 360 

AGGGCCTOAG TGGGGACAAG ACCTCGGGCT TCCOGCCCAT 03CCAAGCCG GCCTACAGCC 420 

CAGOCTOCTG GTCCTCCCGC TCOGCCGTGG ATCTGA6CTG CAGTCGGAGG CTGAGTTCAG 480 

CCCACAATG6 GGGCAGOGCC TTTGGQGCOQ CTGGGTACGG GGGTGCCCA6 CCCACCCCTC 540 

CCATGCCCAC CA66CCCGTG TCCTTCCAT6 AGC6066TG6 06TTGGGAGC CGGGOOSACT 600 

ATGACACACT CTCCCTGCGC TCGCTGCGGC TGGGGCCOGG GG6CCTGGAC GACCGCTACA 660 

6CCPGGTGTC TGAGCAGCTG GAGCCCGCGG CCACCTCCAC CTACAGGGCC TTTQCGTA03 720 

AGCGCCAG6C CAGCTCCAGC TCCA6CCGGG CAGGGGGGCT GGACTQGCCC GA6GCCACTG 780 

AGGTTTCCCC GAGCC6GACC ATOOOTGCCC CTGCCGTGCQ GAOOCTGCAG G6ATTCCAGA 840 

GCAGCCACOG GASCCQCGGG 0TAGQG6GGG CAGTGCC6GG GGCCGTCCTG GA6CCAGTGG 900 

CTCiSAGOGCC ATCTGTGCGC AGCCTCA6CC TCAGCCTG6C TGACTCGGGC CACCTGCC3G 960 

ACGTGCATGG GTTCAACAGC TAiOGGTAGCC ACCGAACCCT GCRGAGACTC AGCAGOSQTT 1020 

TTGATGACAT TGACCTGCCC TCAGCAGTCA AGTACCTCAT GGCTTCAGAC CCCAACCTGC 1080 

AG G TOCTG G G AGOGOCCXAC AT0CA6CACA AGT6CTACAG OGATGCACCC GCCAMAAGC 1140 

A6GCCC6CAG CCTTCAGGCC GT6CCTAGGC TGGTGAA6CT CTTCAACCAC 6CCAACCAGG 1200 

AAOTOCAOCG CCAT6CCACA GGTGCCATGC GCAACCTCAT CTACGACAAC GCTGACAACA 1260 

AQCTQQCCCT GGTGGAGGAG AACGGGATCT TOGAGCTGCT GOSGACACTG OGGGAOCAGG 1320 

ATGATGAGCT TCGCAAAAAT GTCACAGGGA TCCTGTQGAA CCTTTCATCC AGCGACCACC 1380 

T6AAGQACCG CCT6GOCAGA GACAGGCTGG AGCAOCTCAC GGAOCTGGTG TTGAGOOCCC 1440 

TGTGGG6GGC TGGGGGTCCC CCCCTCATCC AGCAGAACGC CTC96AGG06 6AGATCTTCT 1500 

ACAAGQCCAC CGOCTTCCTC AGGAACCTCA GCTCAGCCTC TCAGGCCACT CGCCA6AA6A 1560 

TG03GGAGTG CCACGGGCTG GTGGACGCCC TGGTCACCTC TATCAACCAC GCCCTGGAOG 1620 

C6G0CAAATG 06A6QACAAG AGOGrGGAGA AOGOGGTGTG OBTGCTGGQG AACCTGTCCT 1680 

ACOGCCTCTA CGAOBAOATG CCGCCGTCCG OGCTGCAGOO GCTQGA6GGT OGGGGCOSCA 1740 

GGGACCIGGC GGGGGCGCOG CCGGQAGAGG T06TGG6CTG CITCACGC06 CAGA6CGGGC 1800 

GGCTOOQOQA GCTOCCCCTC OCOGCOGATG CGCTCACCTT OGOGGAGGTG TCCAAGGACC 1860 

CCAAOGQCXrr CGAGTGOCTG TGGAGCCCCX: AGATCGTGGG GCTGTACAAC CGGCTGCTGC 1920 

AGG6CT6CGA GCTCAACGGG CACAOGAOGG AGGOGGCOGC 0GG660QCTG CAGAA CATCA 1980 

0GGCAGG06A CC6CAG0TGG 6CGGGGGTGC T6A0GC8CCT GGGCCIGGAQ CAGGAGOGTA 2040 

TTCTGAACCC CCTGCTAGAC GQTGTCAOGA CC6C06ACCA CCA0CA6CTG CGCTCACTGA 2100 

CTG G CCTCAT COSAAACCTO TCTCGGAACG CTAGGAACAA G(310GAGATG TCCACGAAGG 2160 

TOGTGAQCCA CCTGATCGAG AAOCTGCCAG GCAGCGTGGG TGAGAAGTOG CCCCCAGCCG 2220 

AGGT6CTGGT CAACATCATA GCTGTGCTCA ACAACCTGGT GGTGGCCAGC CCCAT08CTG 2280 

CCCGA6ACCT GCTGTATTTT GAOGGACTCC GAAAOCTCAT CTTCATCAAQ AMSUU300QG 2340 

ACAGCCCCGA CAGTGAGAAG TCCTCCCGGG CAGCATCCAG CCTCCTGGCC AACCTGTGGC 2400 

AGTACAACAA GCTCCACCGT GACTTTCQGQ 06AAGGGCTA TC6GAAG6A6 6ACTTCCTX3G 2460 

GCCCATAGGT GAAGCCTTCT GGAQGAGAAO GTGACGTGGC CCAGCGTCCA AGGGACAGAC 2520 

TCAGCTCCAG GCTGCTTGGC AGCXCAGCCT GGAGGAGAAG GCTAATGAOG GAG6G6CCCC 2580 

TCGCTGGGGC CCCTGTGTGC ATCTTTGAGG GTCCTG6QCC ACCA06A0Q0 GGAOGGTCTT 2640 

ATAGCTGG6G ACTT6GCTTC CGCAGGGCAG GGGGTG6GGC AGGGCTCAAO G CTGCTCT SS 2700 

TGTATGGGGT G6TGA0CCA0 TCACATTGGC AGA6GI6G60 GTT06CTGT0 GCCTQGCAGT 2760 

ATCTTGGGAT AGCCAGCACT GGGAATAAAO ATGGCGATOA ACAGTCACAA AAAAAAAAAA 2820 
AAAAG6AATT C 

Seq ID NO: 351 Protein sequence
Protein Accession #: NP_009114.1

1 11 21 31 41 51

| | | | | |

MQDGiTPLLSA LQPEACVCSL ALPSDLQU>R [missing] LRAARVQEQV RARLU2XX3QQ 60

PRHKGAAEPH PEAETAEGTS RGQYHTIiQAG FSSRSQ(a[*S6 DKTSGPRPIA KPAYSFA6HS 120

SaSAVDLSCS RRLSSAHNG6 SAF6AAGYGG AQPTPPMPTR PVSFKBRGGV GSRADYDTIiS 180

LRSLRLGPGG LDDRYSLVSE QLEPAATSTY RAFAYERQAS [missing] WPEATEVSPS 240

RTZRAPAVRT LQRFQS5BRS RGVGGftVPGA VIiEPVARAPS VRSLSLSLAD [missing] 300

NSYGSHRTLQ RLS5GPDDI0 LPSAVKYLMA SDPiniQfVI.GA AYIQEKC7SD AAAKXQARSL 360

QAVPRLVKLF NHANQEVORH ATGAMRNLZY ONAONKLALV EEKGZFELLR TLREQODSLR 420

XNVTGILWNL SSSOHLKDRL ARDTLEQLTD LVLSPLSGA6 GPPLIQQNAS BAEIFYKATG 480

FLRNIiSSASQ ATRQKMHBCH GLVDALVTSZ NHAItDAGKCB DRSVQIAVCV LRNLSYRLYD 540

EKPPSALQRL BGBGRRDIAG A PPGBW GCP TPQSRRLREL PXAASALTFA EVSRDPXBLB 600

WLHSPQIVGL YIIRLLQ&CEL NBHTrSAAAG ALONITAGDR RNAGVLSSLA ZiBQBRZUIPL 660

LDRVRTADHH QLRSLTGZiIR NLSRNARNiCD ENSTKWSBL ZEKLPGSVGB XSPFAEVLVK 720

ItKVhtQULW ASPIAARDLL YPDOIiRKLIF [missing] EKSSRAA8SL LANLNQYHKL 780

HRDFRAKGYR KEDFL6P



Seq ID NO: 352 DNA sequence
Nucleic Acid Accession #: M31469
Coding sequence: 1-651

1 11 21 31 41 51

| | | | | |

ATGGCTGOSC AGOGAGAGCC CCAGGTCXIAG TTCAAACTTG TATTGGTTGG TGATGGTGGT 60

ACTGGAAAAA CGACCTTOGT GAAACGTCAT TTGACTGGTG AATTTGA6AA GAAGTATGTA 120

GCCACCTT6G GTGTTGAG6T TCATCCCCTA GTGTTCCACA CX:AACAGAGG ACCTATTAAO 180

TTCAATC3TAT GGGACACAGC CGGCCAGGAG AAATTCGGTG GACTGAGAGA TGGCTATTAT 240

ATCCAAGCCC AGTGTGCCAT CATAATGTTT GATGTAACAT CGAGAGTTAC TTACAAGAAT 300

GTGCCTAACT GGCATAGAGA TCTGGTAOGA GT6TGTGAAA ACATCCCCAT TGTGTTGTGT 360

6GCAACAAAG TGGATATTAA GGACAGGAAA GTGAAGGCGA AATCCATTCT CTTCCACCGA 420

AAGAAGAATC TTCAGTACTA OGACATTTCT GGCAAAAGTA ACTACAACTT TGAAAAGCCC 480

TTCCTCTGGC TTGCTAGOAA OCTCATTGGA GACCCTAACT TGGAATTTGT TGCCATGOCT 540

GCTCTGGCOC CACCAGAAGT TGTCATGGAC GCAGCTTTGG CA6CACAGTA TGAGCAGQAC 600

TXAQAGGTTQ CTCAGACAAC T6CTCTCCOG GATGAGGATG ATGAOCrGTQ A



Seq ID NO: 353 Protein sequence
Protein Accession #: AAA36546

1 11 21 31 41 51

| | | | | |

MAAQGBPQVQ FKLVLVGDGG TGKTTFVKRH LTGEFEKKW ATLGVEVHPL VFHTKRGPIK 60

FNVHDTAGQE KFQGXiRDGYY IQAQCAIIMF OVTSRVTYKU VPIIWKRDLVR VCEKIPIVIjC 120

(BIKVDIKDRK VKAKSIVFHR XRHLQYYDIS AXanNFBKP FLNLARXLIG DPNLBFVAMP 180

ALAPPEWKD PALAAOVEBD LEVAQTTALP DEDDDL



Seq ID NO: 354 DNA sequence
Nucleic Acid Accession #: NM_002820
Coding sequence: 304-831



1 

I 

0CG8TTGGCA 
OCCTGTTCCA 
CXntSTAAACA 
TTCAGAGGAA 
GTTTGGAGAA 
A06ATGCA6C 
GTGCCCTCCT 
6AACATCAGC 
CTTCACCATC 
CCTAACTCCA 
6AGG0CAGAT 
AAGACACCT6 
AAACGG06AA 
GACCACCT6T 
CTOGCCOGTA 
GCTTQGACAA 
CAGAGAATAA 
TGTCCTCCAS 
CATCAATCCT 
ATCTTCATAA 
TTCTTCAGT6 
GATATTATCT 
ACTTTTTATT 
TAAATTAT6T 
0CA6CTCATA 
UU ' m"l"lt.lt: 
COGTAGGAAA 



11 
I 

AAGAAGCTGA 
OSAAGOCAGO 
CACTACTTAT 
GCGCCTCTGA 
AGCACA6TT6 
GGAGACTGGT 
GCGGGOOCTC 

tcctccatga 
tgatogcaga 
agccx:tctcc 

ACCTAACTCA 
GGAAQAAAAA 
CTOGCTCTGC 
CTQACACCTC 
G0CTCAG06G 
AOCTAGAATT 
CTCAGAATAT 
CACCATAGAO 
TTACCACTCT 
TTT6CTGGAG 
TTTTTCATTT 
ACAAACACT6 
TAATTAAAT6 
TTTAAACACA 
CAAAAT AAAT 
ATOIATCTTT 
AATAAAACTT 



21 
I 

CTTCA6A00G 
A6AACTGCT0 

CATTGATGCA 

GAGTAGCOSG 
TCAOCAGTGO 
GGTGGA66GT 
CAAGGGGAAG 
AATCCACACA 
CAACACAAAG 
GGAAACTAAC 
GAAAGOCAAG 
CTGGTTAGAC 
CACAACGT06 
GGTGCTCTCA 
TTCTCOCTTT 
TGTCTQOCTT 
AGGOSCTAGA 
ACCAAATAAT 
AAGTGTATTT 
CTTAOSTTCT 
GA6AACAGCA 
TATTTAATTA 
TGCCTTAAAT 
GGTTTCTGAA 
TTGTTCATTO 
CACATTTAAA 



31 
I 

0GAAACT7TC 
GCCAGATTAA 
TATATAAAAC 
TTTTCCCTTT 
TTOCTAAATA 
AOOGTCGGGO 
CTCA6CC6CC 
TCC3^TCCAAG 
GCTGAAATCA 
AACCACCCCG 
AAlGGTOGftGft 
CCCGGGAAAC 
TCTGGAGTGA 
CTGGAGCTOG 
GCTGGSTTTT 
ATGTATCTCT 
AAA6CAGTAC 
GCCCATTCCT 
TTCATATTCA 
CTTCCCCTTA 
TTCAOTl'CAA 
TCRTGTCATA 
AATCTCAAAT 
TTCTTTAATT 
AATGTTTAA3 
GCAAOATGAA 
AAAAA 



41 

I 

TTCITTTAGG 
TTAGACATTO 
CATTTTATTT 
TTGCTCTTTC 
AGTCtXGAGC 
TGTTCCTGCT 
GCCTOUUUUS 
ATTTAOQGCG 
QAGCTACXrrC 
TCOQATTTGG 
OGTACAAAGA 
6CAAGGA6CA 
CTGGGAGTGO 
ATTCAOGGTA 
GGAGCCTCCC 
AT08ATTGT6 
COCCCTACCA 
CTTTCTCCAC 
AGCTTCAGAA 
CTCTCACACC 
GGGAGAATA7 
AA0GA7TCT6 
TTATTTTAAT 
AAATTTAACT 
TATTAACTTA 
ATAATTTTTC 



51 
I 

AGGCGGTTAG 
CTATGGGAGA 
T08CTATTAT 
TGOCTGTGTG 
G0GAGG6GA6 
GABCTA060G 
AGCTGT8TCT 
AOSATTCTTC 
GGAGGTGTCC 
GTCTGATGAT 
GCAQCOGCTC 
GGAAAAOAAA 
GCTAGAAGGG 
ACA6GCTTCT 
TTCTGOCTTG 
TAGCAATT6A 
CACACACOOC 
CGTCACCCAA 
GCTAOTGACC 
TGGGCAAACr 
AGAAGGATTT 
ASCCATTCAC 
GTAAAGAACT 
CTGGTTTCTA 
CAAGGATATA 
TAGGGTAATG 



Seq ID NO: 355 Protein sequence
Protein Accession #: NM_002620



60 
60
120 
180 
240 
300 
360 
420 
480 
540 
600 



60 
120 
180 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



1 11 21 31 41 51

| | | | | |

KQRRLVQQWS VAVFLLSYAV PS0GRSVE6L SRRLXRAVSE HQLLHDKGKS IQDLRRRFFL 60

BBLIAEIHTA EIRATSEVSP NSKPSFNTRN HPVRFGSDDB GRYLTQETNK VETYKEQPLK 120

TF6KKKKGKP GKRREQEKKK RRTRSAWLDS GVTGSGLBGD HLSDTSTTSL ELOSR

Seq ID KO: 356 DKA sequence 
Kticleic Acid Accession 8: KM_017S22 
coding sequence: 1>2100 

X 11 21 31 41 51 

I } I I I 1 

ATGGGCCTCC CCGAGCCGGG CCCTCTCCXM CTTCTOGCGC TGCTGCTGCT GCT6CTGCTG 60 

CTGCTGCTGC TGCGGCTCCA GCATCTTG06 GOGGCAGCGG CTGATCC3GCT GCTCGGOGGC 120 

CAAGOGCOGG CCAA6GAGTG CGAAAAGGAC CAATTCCA6T 6CCGGAA06A G06CTGCATC 180 

CCCTCTGTGT GGA6AT6CGA CGAGGAOGAT GACTGCTTAG A0C3VCAGGGA OGAGGAOSAC 240 

TGCCCCAAGA AGACCTGTGC AGACAGTGAC TTCACCTGTG ACAAOGGCCA CT6CATCCAC 300 

GAAOGGTGGA AGTGTGACGO CGAGGAGGAG TGTCCTGATO GCTCCGATGA GTCCGAGGCC 360 

ACTTGCACCA AGCA6GTGTG TCCTGCAGAG AAGCTQA6CT GTGGACCCAC CAGCCACAA6 420 

TGTGTACCTG CCTOGTG6CG CTG06ACGGG GAGAAGGACT GC6AGGGTGG A6CQGATGAG 480 

6CGGGCTGT0 CTACCTCACT GGGCACCTGC CGTGGGGA06 AGTTOCAGTG TGGGGATGGG 540 

ACATGTGTCC TTGCAATCAA GCACTGCAAC CAGGAGCAGG ACTGTCCAGA TGGGAGTGAT 600 

GAAGCTGGCT GCCTACAGGG GCTGAACGAG TGTCTGCACA ACAATQGCGG CTGCTCACAC 660 

ATCTGCACTG ACCTCAAGAT TGGCTTTGAA TGCAOGTGCC CAGCAGGCTT CCAGCTOCTG 720 

GACCAGAAGA CTTGTGGCGA CATTGATGAG TGCAAGGACC CAGATGCCTG CAGCCAGATC 780 

TGTC7TCAATT ACAAGGGCTA TTTTAA6TGT GAGTCCTACC CTGGCXIGOGA GATGGACJCTA 840 

CTGAOCAAOA ACTGCAAG6C TGCTOCTGGC AAGAGCCCAT CCCTAATCTT CACCAACCGC 900 

AOGAGTOCGG AGGATCGACC TCT6AAGCGG AACTATTCAC GCCTCATCCC CATGCTCAAG 960 

AATGTCGTGG CACTAGATGT GGAA G TT GC C AOCAATOGCA TCTACTGGTG TGACCTCTCC 1020 

TACCGTAAGA TCTATAGCGC CTACATGGAC AAGGCCAGTG ACCOSAAAGA GOGGGAGGTC 1080 

CTCATTGAOG AGCAGTTGCA CTCTCCAGAG GGCCTGGCAG TGGACTGGGT CCACAAGCAC 1140 

ATCTACIGGA CTQACTOGGO CAATAAGACC ATCtCAGTGG OCACAGTTGA T66TGGC06C 1200 

OSAOGCACTC TCTTCAGOOG TAACCTCAGT GAACCCOGGG CCATOGCTOT TGACCC3CCTG 1260 

CGAGGGTTCA TGTATTGGTC TGACTGGGGG GACCAGGCCA A6ATTGAGAA ATCTGGGCTC 1320 

AACGGTGTGG ACOGGCAAAC ACTGGTGTCA OACAATATTG AATGGCCCAA OSGAATCACC 1380 

CTGGATCTGC T6AGCCAGGG CTTGTACTGG GTAGACTCCA AGCTACACCA ACTGTOCAGC 1440 

ATTGACTTCA GTaQAOGCAA CAGAAA6A06 CT6ATCTCCT OCA CTSACT T CCTGAGCCAC 1500 

OCTTTTGGGA TAOCTGTGTT TGAGGACAAG GTGTTCTG6A CAGACCTGGA 6AAGGAGGCC 1560 

ATTTTCAGTQ CAAATOGGCT (MTOGCCTG GAAATCTCCA TCCTGGCTGA GAACXTTCAAC 1620 

AACCCACATG ACATTGTCAT CTTCCATGAG CTGAAGCAGC CAAGAGCTCC AGATGCCTGT 1680 

GAGCTGAGTG TCCAGCCTAA TGGAGGCTGT GAATACCTGT GCCTTCCT6C TCCTCAGATC 1740 

TOCAGOCACT CTCCCAA6TA CACATGTGCC TOTCCXOACA CAATGTG6CT GGGTCCAGAC 1800 

ATGAAGAGGT 6CTAC06AGA TYSCAAATGAA GACAGTAA6A TGGGCTCAAC A6TCACT6CC 1860 

GCTGTTATOG GGATCATOGT GCCCATAGTG GTGATAGCCC TCCTGrGCAT GAGTGGATAC 1920 

CTGATCTGGA GAAACTGGAA GOGGAAGAAC ACCAAAAGCA TGAATTTTGA CAACCCAGTC 1980 

TACAGGAAAA CAACAGAAGA AGAAGATGAA GATGAGCTCC ATATAGGGAG AACTGCTCAG 2040 

ATTGGCCATG TCTATCCT6C A0GA6TGGCA TTAAGCCTTO AA6ATGATGG ACTACCCTGA 2100 

GGATGGGATC ACOCCCTTOO TGCCTCATG6 AATTCAGTCC CATGCACTAC ACTC08GATG 2160 

GTGTATQACT GGATGAATGC GTTTCTATAT ATGGGTCTGT GTGAGTGTAT GTGTGTGTGT 2220 

GATTTTTTTT TTTAAATTTA TGTTGOGGAA AGGTAACCAC AAAGTTATGA TGAACTGCAA 2280 

ACATCCAAAG GATGTGAGAG TTTTTCTATG TATAATGTTT TATACACTTT TTAACTGGTT 2340 

6CACTACCCA TGAGGAATTG GTGQAATOGC TACTGCTGAC TAACATGATG CACATAACCA 2400 

AATQGGQGCC AATGGCACAO TAOCTTACTC ATCATTXAAA AACTATATTT ACAGAAGATG 2460 

TTTGGTTGCT GGGGOGCTTT TTTAGGTTTT GG6CATTTGT TTTTTGTAAA TAASATGATT 2520 
AT6CTTTGT0 GCTATCCATC AACATAAGT 

Seq ID NO: 357 Protein sequence
Protein Accession #: NP_059992

1 11 21 31 41 51

| | | | | |

NGLPBP6PIA TiTATiTiTiTiTiTiTi LLIiLRLQHLA AAAADPLLOG QGPARSCBKD QFOOQIGRCI 60 

PSVHRCDEDD DCLDBSDEDD CPKKTCADSD FTCDliGHCIH ERMKCDGEEB CPDGSDESEA 120 

TCTKQVCPAE KLSCGPTSHK CVPASWRCDG EKDCEGGADB AGCATSLGTC RGDEFQGGDG 180 

TCVLAIKEQl QEQDCPDGSD EAGCLQGLNB CLHiniGGCSH ICTDLRIGFB CTCPAGFQIiL 240 

DQKTCGDIDB CKDPDACSQI CVimCGYFKC ECYPGCEMDL LTKHCKAAAO KSPSIiIFTNR 300 

TSAEDRPVKR NYSRLIPMLK HWAU3VBVA INRIYHCDLS YBKIYSAlfMD KASOPXEREV 360 

LIDEQLBSPB GLAVDHVHKH lYWTDStancr ISVATVDGSR RSTLFSRiniS EPRAIAVDPL* 420 

RGPMYKSDHO DQAKIEKSQL MGVDRQTLVS DtflEWFNGIT LDLLSQRLYK VDSKLHQLSS 480 

IDPSGGHRKT LISSTDPLSH PFGXAVFEDK VFWTDIiQJEA IPSAHRLKGL BISILAENU7 540 

NPBDIVZFHB LKQPRAPDAC ELSVQSKOGC EYLChPAPQl S5KSP1CYTCA CFDTKHLGFD 600 

HKRCYBDAMB DSKMGSTVTA AVIGIIVPIV VIALLCMSGY LINRNWXRRll TKSMNFDNPV 660 
YRKTTEBEDB DSLHIGRTAQ IGRVYPARVA LSItEDDGLP 

Seq ID NO: 358 DNA sequence
Nucleic Acid Accession #: K27826
Coding sequence: <1-503

1 11 21 31 41 51

| | | | | |

AGCCCAAGAA ACATCTCACC AATTTCAAAT CTGATCTATT OGGCTTAGCG ACTQAAGATT 60 

GACGCTGCCC GATCGCCTOG GAAGTGCCCT GGACCATCAC AGAAGCOGAG CTTOOGOCAA 120 

CTCTCACAGT GGAGGGTAAG TCCATOOCCT GTTTAATOQA TAOSGGGGCT ACOCACTCCA 180 

CGTTGCCTTC TTTTCAAGGG CCTGTTTCCC TTGCCCCCAT AACTGTTGTG GGTATTGAOG 240 

GCCAAGCTTC AAAACCCCTO AAAACTCCCC CACTCTGGTG CCAACTTGGA CAACACTCTT 300 

TTATOCACTC TTTTTTAGTT ATCCCCACCT GCCC3VCTTCC CTTATTAOQC CX3AAATATTT 360 

TAACCAAATT ATCTGCTTCC CTGACTATTC CTGGAGTACA GCTACATCTC ATTGCTGCCX: 420 

TTCTTCCCAA TCCAAAGCCT CCTTTGTGTC CTCTAACATC CCCACAATAT CAGCXXHTAC 480 

CACAAQACCT CCCTTCAGCT TAATCTCTCC CACTCTAGGT TCCCACGCCQ CCCCTAATCC 540 

CACTTGAAOC AGCCCTGAQA AACATOGCCC ATTCTCTCTC CATACCACCC CCCAAAAATT 600 

TT06C06CTC CAAC3VCTTCA ACACTATTTT GTTTTATTTG TCTTATTAAT ATCAGAAGGC 660




AGGAAIGTOl GGCCTCTGA6 <XCAGGGCAG GCCATOSC A T 00CCTGTG21C TTGCAOGTAT 720

AOVTCCAGHT GGCCTGAAGT AJVCTGAAGAT CCACAAAAGA AGTAAAAACA GCCTTAACTG 780

ATGACATTCC ACCaTTGTGA TTTGTTCCTG CCCCACCCTA ACTGATCAAT GTACTTTGTA 840 

ATCTCCCCCA CCCTTAAGAA GGTTCTTTGT AATTCTCCXX: ACCCTTGAGA ATGTACTTTG 900 

TGAGATCCAC CCCTGCCCAC CAGAGAACAA CCOCCTTTGA TTGTAATTTT TTATXACCTT 960 

COCAAATOCT ATAAAACAGC CCCACOCCTA TCTTCCTTCA CIGACTCTCT TTTOGGACTC 1020 
AGCCACG06C AOOCAjGGTGA AATAAACAGC T P V A TRiCVC AC 

Seq ID NO: 359 Protein sequence 
Protein Accession #: AAA6S999

1 11 21 31 41 51

| | | | | |

PKKHLTNFKS DLPGLATEDW RCPIASBVPW TITBAELRVT LTVEGKSIPC LIDTGATHST 60 

liPSFQGFVSL APITWGIDG QASKPLKTPP LK0QL6QHSF MHSFLVZPTC PLPLLQRNIL 120 
TKL5ASLTIP GVQLHLIAAL LFHPKPPLCP LTSFOYOPLP QDLPSA 

Seq ID NO: 360 DNA sequence 
Nucleic Acid Accession #: NM_001854
Coding sequence: 162-5582

1 11 21 31 41 51

| | | | | |

AACCATCAAA TTTAGAAGAA AAAGCCCTTT GACTTTTTCC CCCTCTCCCT CCCCAATGGC 60 

TGTGTAGCAA ACATCCCT66 OGATACCTTG GAAAGGAOSA ASTTG6TCT0 CAGT06CAAT 120 

T TOBTG GG TT GAGTTCACAO TT GTGAO TGC GG6GCTG0QA GA TOGAO COS TGGTCCTCTA 180 

GQTGGAAAAC 6AAACGGTG0 CTCIGGGATT TCACOGTAAC AACCCTC6CA TTQACCTTCC 240 

TCTTCCAAGC TAGAGAGGTC AGAGGAGCTG CTCCAGTTGA TGTACTAAAA GCACTAGATT 300 

TTCACAATTC TCCAGAGGGA ATATCAAAAA CAACGGGATT TTGCACAAAC AGAAAGAATT 360 

CTAAAGGCTC AGATACTGCT TACAGAGTTT CAAAGCAAGC ACAACTCAGT GCCCCAACAA 420 

AACAGTTATT TCCAOGTGGA ACTTTCCCAG AAGACTTTTC AATACTATTT ACAGTAAAAC 480 

CAAAAAAAGG AATTCAGTCT TTCCTTTTAT CTATATATAA TGAGCATGGT ATTCAGCAAA 540 

TTGGTGTTGA GGTTGGGAGA TCACCTGTTT rrCTGTTTGA AGACCACACT GGAAAACCTG 600 

CCCCAGAAGA CTATCCCCTC TTCAGAACTG TTAACATOOC TGACGGGAAG TGGCATCGGG 660 

TAGCAATCAG CGTGGAGAAG AAAACTGTGA CAATGATTGT TGATTGTAAG AA6AAAACCA 720 

06AAACCACT TGATAGAAGT GAGAGAGCAA TTGTTGATAC CAATGGAATC ACGGTTTTTG 780 

GAACAAGGAT TTTGGATGAA GAAGTTTTTG AGGGGGACAT TCAGCAGTTT TTGATCACAG 840 

GTGATCCCAA GGCAGCATAT GACTACTGTG AGCATTATAG TCCAGACTGT GACTCTTCAG 900 

CACCCAA6GC TGCTCAA6CT CA6GAACCTC AGATAGATGA GTAT6CACCA GAGGATATAA 960 

TCGAATATGA CTATGAGTAT GGGGAA6CAG AGTATAAA6A GGCTGAAAGT GTAACAGA6G 1020 

GACCCACTGT AACTGAGGAG ACAATAGCAC AGACGGAGGC AAACATCGTT GATGATTTTC 1080 

AAGAATACAA CTATGGAACA ATGGAAAGTT ACCAGACAGA AGCTCCTAGG GATGriTCTG 1140 

GGACAAATGA 6CCAAATCCA GTTGAAGAAA TATTTACTGA AGAATATCTA AOGGQAGAGG 1200 

ATTATGATTC CCAGAGGAAA AATTCT6AGG ATACACTATA TGAAAACAAA 6AAATAQACG 1260 

GCAGGGATTC TGATCTTCTG GTAGAT6GAG ATTTAGGCGA ATATGATTTT TATGAATATA 1320 

AAGAATATGA AGATAAACCA ACAAGCCCCC CTAATGAAGA ATTTGGTCCA GGTGTACCAG 1380 

CAGAAACTGA TATTACAGAA ACAAGCATAA ATGGCCATGG T6CATATGGA GAGAAAGGAC 1440 

AGAAAGGAGA ACCAGCAGTO GTTGAGCCTQ GTATGCTTGT 08AA6GACCA GCftaGACCAG 1500 

CAGQACCTGC AGGTATTATQ GGTCCTCCAO GTCTACAAGQ CCCCACTG6A C0CCCTGGT6 1560 

ACCCTGGCGA TAGGGGCCCC OCAGGACOTC CTG6CTTACC AQOOGCTGAT GGTCTACCTG 1620 

GTCCTOCTGQ TACTATGTTQ ATGTTACOGT TCCGTTATGG TGGTGATGGT TCCAAAGGAC 1680 

CAACCATCTC T6CTCAGGAA 6CTCAGGCTC AA6CTATTCT TCAGCAGGCT CGGATTGCTC 1740 

TGAGAGOCCC A0CTG6CCCA ATGGGTCTAA CTGGAAGACC AGGTC C TG T G G6QGGGCCTQ 1800 

GTTCATCTGG GGCCAAAGGT 6A6AGTGGTG ATCCAQGTCC TCAGGGCOCT CGAGGOSTCC 1860

AGGGTCCCCC TG6TCCAAC6 GGAAAACCTG GAAAAAGGGG TC6TCCAGGT GCAGATGGAG 1920 

GAAGAGGAAT GCCAGGAGAA CCTGGGGCAA AGGGAGATCX3 AGGGTTTGAT GGACTTCCGG 1980 

GTCTGCCAGG TGACAAAGGT CACAGGGGTO AACGAGGTCC TCAA66TCCT CCAGGTCCTC 2040 

CTG GIGA TGA TGGAATGA60 GGAGAAGATG GA6AAATTGG ACCAAGA6GT CTTOCAGGTG 2100 

AA6CTGGCGC ACGAGGTTTG CTGGGTCCAA GGGGAACTCC AGGAGCTCCA GGGCAGCCTG 2160 

GTATGGCA6G TGTAGATGGC CCCCCAGGAC CAAAAOGGAA CATOGGTCCC CAAGGGGAGC 2220 

CTGGGCCTCC AG6TCAACAA GGGAATCCAG GACCTCAGGO TCTTCCTGGT CCACAAGGTC 2280 

CAATTGGTCC TCCTGGTGAA AAAGGACCAC AAGGAAAACC AGGACTTGCT GGAC7TCCTG 2340 

GTGCTGATOG GCCTOCTGGT CATCCTGGGA AA6AAG6CCA GTCTGGAGAA AAGGGGQCTC 2400 

TG66TCCCCC TGGTCCACAA GGTCCTATTG GATNNCC6GG CCCCOGGGGA GTAAAGGGAG 2460 

CAGATGGTGT CAGAGGTCTC AAGGGATCTA AAGGTGAAAA GGGTGAAGAT GGTTTTCCAG 2520 

GATTCAAAGG TGACATGGG7 CTAAAAGGTG ACAGAGGAGA AGTTGGTCAA ATTGGCCCAA 2580 

GA6GGNAAGA 316GCCCTGAA GGACCCAAAG G7CGAGCA0G CCCAACTGGA GAOCCAGGTC 2640 

CTTCAGOTCA AGCAGGAGAA AAGGGAAAAC TTG6AGTTCC AGGATTACCA GGATATOCAG 2700 

GAAGACAAGG TCCAAAGG6T TCCACTGGAT TCCCTGGGTT TCCAGGTGCC AAT66AGAGA 2760 

AAGGT6CAC6 GGGAGTAGCT GGCAAACCA6 6CCCTCGGGG TCAGCGTGGT CCAAOGGGTC 2820 

CTGGAGGTTC AAGAGGTGCA AGAGGTCCCA CTGGGAAACC TGGGCCAAAQ GGCACTTCAG 2880 

GTGGOGATGG CCCTCCTGGC OCTCCAGGTQ AAAGAGGTCC TCAA6GACCT CAGGGTCCAG 2940 

TTGGArrOCC TGGAOCAAAA GGCCCTCCTO GACCACCAOG AAGGATG66C TGCCCAG6AC 3000 

ACCCTGGGCA AC6T6GGGAG ACTGGATTTC AAGGCAAGAC OGGCCCTCCT GGGCCAGGGG 3060 

GAGTGGTTGG ACCACAGGGA CCAACCOGTG AGACTGGTCC AATAQGGGAA CGTQGGTATC 3120 

CTGGTCCTCC TGGCCCTCCT GGTGAGCAAG GTCTTCCTGG TGCTQCAGGA AAAGAAGGTG 3180 

CAAAGGGTGA TCCAGGTCCT CAAG6TATCT CAGGGAAAGA 7G6ACCAGCA GGATTAOGTG 3240 

GTTTCCCAOG GGAAAGAG6T CTTCCTGGA6 CTCAGGGTGC AGCTGGACTG AAAGGAGGG6 3300 

AAGGTCCCCA GGGCCCACCA GGTCCAGTTG GCTCACCAGG AGAAGGTGG6 TCAGCAGGTA 3360 

CAGCPG6CCC AATTGGTTTA CGAGGGCGCC CGGGACCTCA OGGTCCTCCT GGTCCAGCTG 3420 

GAGAGAAAGG TGCTCCTGGA GAAAAAGGTC CCCAAGGOCC TGCAQQGAGA 6AIGGAGTTC 3480 

AAGGTCCTGT TG6TCTCCCA GGGCCAGCTG GTCCTGCCGQ CTCCCCTGGG GAAGAOGGAG 3540 

ACAAGGGTOA AATTGGTGAO 0G6QGACAAA AA06CAGCAA GGGTGGCAAa 66AGAAAATQ 3600 

6CCCTCCCG0 TCCCCCAG6T CTTCAAGGAC CAGTT6GTGC CCCTGGAATT GCT6GAGGT6 3660 

ATGGTGAACC AGGTCCTAGA GGACAGCAGG GGATGTTTGG GCAAAAAGGT GATGAGGGTC 3720 

CCAGAGGCTT CCCTGGACCT CCTGGTCCAA TAOGTCTTCA GGGTCTGCCA GGCCCACCTG 3780 

GTGAAAAAGG TGAAAATGGG GATGTTGGTC CATOGGGGOC ACCIGGTCCT CCAGGOCCAA 3840

GAGGCCCrCA A66TCCCAAT GGAGCTGIITG GhCC31CAAG6 ACCCCCAG6T TCTGTTGGTT 3900

OVGTTGGTGG TGTTGGAGAA AAGGCmSAAC CTG6AGAAGC AOGAAACCCA 6GGCCTCCT6 3960 

GGGAAGCAGG TGTAGGCGGT CCCAAAGGAG AAAGAGGAGA GAAAGGGGAA GCTGGTCCAC 4020 

CTGGAGCTGC TGGACCTCX:A GGTGCCAAGG GGCOGCCAGG TGATGATGGC CCTAAGGGTA 4080 

ACXXXX3GT0C TGTTGGTTTT CCTOGAGATC CTGGTOCTOC TCGGGAACTT GGCCCTGCAO 4140 

GIGAAGATGO T OTl W it SS T GACAAQGGT6 AAGATGGAGA TCCT6GTCAA OOGGGTCCTC 4200 

CTGGCCCATC TGGTGA6SCT GCCOCACCAG GTCCTOCTGG AAAACGAGGT CCTCCTGGAO 4260 

CTGCAGGTGC AGAGGGAAGA CAAGGTGAAA AAGGTGCTAA GGGOGAAGCA GGTGCA6AA6 4320 

GTCCTCCT6G AAAAACOGGC CCAGTOGGTC CTCAGGGACC TGCAGGAAAG CCTGGTCCAG 4380 

AAGGTCTTOG GGGCAICCCT QGTCCTGTG G GAGAACAAGG TCTCCCTGGA GCTGCAGGCC 4440 

AAEATGGACC ACCTOGTOCT ATGG6ACCTC CTG6CTTA0C TGGTCTCAAA GGTGACCCTO 4500 

GCTCCAAGGQ TGAAAAGGGA CATCCTG6TT TAATTGGCCT GATTGGTCCT CCAGGAGAAC 4560 

AAG66GAAAA AGGTGACOGA GGGCTCCCTG GAACTCAAGG ATCTCCAGGA GCAAAAGGGG 4620 

ATGGGGGAAT TC3CTGGTCCT GCTQGTCCCT TAGGTCCACC TGGTCCTCCA GGCTTACCAG 4680 

GTCCTCAAGG CCCAAAGGGT AACAAA6GCT CTACXGGACC OGCTGGCCAG AAA GGTGA CA 4740 

GTGGTCTTCC AGGGCCTCCT GGGCCTCCAG GTCCACCTGO TGAAGTCATT CAGOCTTTAC 4800 

C31ATCTTGTC CTCCAAAAAA AOGAGAAGAC ATACTGAAGO OITOCAAGCA GATGCAGATG 4860 

ATAATATTCT TGATTACTCG GATGGAATG6 AAGAAATATT TGGTTOCCTC AATTCCCTGA 4920 

AACAAGACAT 0GAGCATAT6 AAATTTCCAA TGGGTACTCA GACCAATCCA GCCCGAACTT 4980 

GTAAAGACCT GCAACTCAGC CATCCTGACT TCCCAGATGG TGAATATTGG ATTGATC CTA 5040 

AGCAAGGTTG CTCAGGAGAT TCCTTCAAAG TTTACTGTAA TTTCACATCT GGTGOrGAGA 5100 

CTTGCATTTA TCCAGACAAA AAATCTGAGG GAGTAAGAAT TTCATCATGG CCftAAOGAGA 5160 

AA0CAG6AAG TXOOTTTAOT GAATTTAAGA GGGGAAAACT GCTTTCATAC TTA6AT6TTG 5220 

AAOGAAATTC CATCAATATG GTGCAAATGA CaTTCCTGAA ACTTCTGACT GCCTCTGCTC 5280 

GGCAAAATTT CACCTACCAC TGTCATCAGT CAGCAGCCTO GTATGATGTG TCATCAGGAA 5340 

GTTATGACAA AGCACTTCGC TTCCTGGGAT CAAATGATGA GGAGATGTCC TATGACAATA 5400 

ATCCTTTTAT CAAAACACTG TATGATGGTT GTAOGTCCAG AAAAGGCTAT GAAAAAACTG 5460 

TCATTGAAAT CAATACACCA AAAATTGAtC AAGTACCTAT TGtTGATGTC AtGATCAGTG 5520 

ACTTTG6TGA TCAGAATCAC AA G TTOSGAT TTGAAGTTGG TCCTGTTTGT TTTCTTGGCT 5580 

AAGATTAAGA CAAAGAACAT ATCAAATCAA CAGAAAATGT ACCTTGGTGC CACCAACCCA 5640 

TTTTGTGCCA CATGCAAGTT TTGAATAAGG ATGTATQQAA AACAAOGCTG CATATACAGG 5700 

TACCATTTAO GAAATACOGA TGCCTTTGTG GQGGCAGAAT CACAGACAAA AGCTTTGAAA 5760 

ATCATAAAGA TATAAGTTQG TOTGQCTAAG ATG6AAACAG GGCTGATTCT TGATTCCCAA 5820 

TTCTCAACTC TCCTTTTCCT ATTTOAATTT CrTTGOTOCT GTAGAAAACA AAAAAAGAAA 5880 

AATATATATT CATAAAAAAT ATGGTGCTCA TTCTCATCCA TCCAGGATGT ACTAAAACAG 5940 

TC3TGTTTAAT AAATTGTAAT TATTTTGT G T ACAGTTCTAT ACTGTTATCT GTGTCCATTT 6000 

CCAAAACTTG CACGTGTCCC TGAATTCOSC TGACTCTAAT TTATGAGGAT GCOGAACTCT 6060 

6ATGGCAATA ATATATGTAT TATGAAAATG AAGTTATGAT TTCC6ATGAC CCTAAGTOOC 6120 
■ mfm ' GOT TAATQATQAA ATTCCTTTQT GTGTCTTT 

Seq ID NO: 361 Protein sequence 
Protein Accession #: NP_001845 

1 11 21 31 41 51 

| | | | | |

MBPWSSRWKT KRWLWDFTVT TLALTFLPQA REVRGAAPVD VLKALDFHNS PEGISKTTGP 60 

CTNRKNSKGS DTAYRVSKQA QIiSAPTKQLP PGGTFPEDFS ILFTVKPKKG IQSPUiSIYN 120 

EKGIQQ26VS V6RSFVFLPE DBT6KPAPED YPZtFRTVHIA D6XHRRVAIS VEKKTVTMZV 180 

DCKKKTTKPL DR5ERAIVDT K6ZTVFGTRX U)KEVFE6DI QQFLXTGDPK AAYSyCBRYS 240 

PDCDSSAPKA AQAQEPQIDE YAPEDIIBYD YKYGEAEYKE AESVTBGPTV TBETIAQTEA 300 

NIVDDPQEYN YGTMESYQTB APRHVSGTNE PNPVEEIPTE BYLTGEDYDS QRKNSEDTLY 360 

QIKEIDGiCDS DLLVDGDLGB YOFYEYKEYE OKPTSPFNEB FGFGVPAETD ITETSINtSG 420 

AYGEK6QRGB PAWEPGMLV EGPPGPAGPA OZMGPPGLQG PTGPPGDPGD RGPPGRPOIiP 480 

GADGIiPGPPG TMU4LPPRYG QDGSKOPTZS AQSAQAQAZL QQARZALRGP P6PM0LT6RP 540 

GPVGGPGSSG AKGES(23PGP QGPRGVQGPP GPTGKPGKRG RPGADGGRGM PG&PGAXGDR 600 

GFDGLP6LPG DKGHROERGP QGPPGPPGDD OIRGEDGBIG PRGLPQBAGP RGLLGPRGTP 660 

GAPQQPGMAO VZXSPPGPRGM MGPQGEPGPP GQQCSIPGPQG LPGPQGPZGP PGEKGPQGKP 720 

aZiAGXiPOADO PPCTPGKEGQ 6GERGALGPP GPC2GPIGXPQ PRGVXGADGV RGIiXGSRGBK 780 

GEDGFPGFKG DM6LKSDRGE VGQIGPRGXD GPBSPKOtAO PT6DPGPSGQ AGEXBKU^VP 840 

GLPGYPGRQG PKGSTGFPGF PGANGEKGAR GVAGKPGPRG QRGPTGPRGS R GARGFT GKP 900 

GPKGTSGGDO PPGPPGKRGP QGPQCPVGFP GPKGPPGPPG RMGCPOPGQ RGETGPQGKT 960 

6PPGPQGW0 PQGPT GS TGP IGERGYPGPP GPFGEQGLPG AAGKBGAKGD P6PQGZS6KD 1020 

6PA0LRGFPO ERGLPGAQOA PGLKGGEGPQ GPFGPVGSPG ER6SAGTAGP IGLRGRPGPQ 1080 

6PPGPAGBKG APGERGPQ6P AGRDGVQ6PV QliFGPAGPAO SP6EDGDK6B IGEPGQKOSK 1140 

GGKGENGPPG PPGLQGPVGA PGIAGGDGHP GPRGQQQ1FG QK6DBGARGF PGPPGPIGLQ 1200 

GLPGPPGBKO ENGDVGPWGP PGPPGPRGPQ GPNGADGPQ6 PPGSVGSVQG VGEKGEPGEA 1260 

GNPGPPGBAO VQGPKGERGE K6BAGPPGAA GPPGAK6PPG DiDGPXGIlPGP V6FP6DPGPP 1320 

GELGPAOQDO VGGDKGEDGD PGQP6PPGPS GEAGPPGPPO XRGPPGAAGA EGRQGSKGAK 1380 

GSAGAEGPPG KTGPVGPQGP A6KP6PE6LR 6IFGPVGBQ6 LPGAAGQDGP PGFKGPPGLP 1440 

GliKGDPGSKO EKGHPGLIGI« IGFPGEQGEK GDSGLPGTQG SPGAKGD6GI PGPAGPLGPP 1500 

GPPGLPGPQG PKGNKGSTGP AGOKGDSGLP GPPGPPGPPG EVIQPLPILS SKKTRRHTEO 1560 

MQADADDNIL DYSDGMBBIF 6SU7SI.KQDI EHMKFPNGTQ TNPARTCaCDL QLSBPDFPDG 1620 

EYWIDPNQOC SGDSPKVYOr FTSGGETCIY PDKKSB6VRI SSHPKBKPGS WFSEPKRGKL 1680 

LSYIiDVEGNS ZNMVOMTFLK LLTASARQNF TYHCHQSAAW YSVSS6SYDK ALRFIiGSNDE 1740 

B4SYDimPFX KTLYDGCTSR IQBYEKTVZSI HTPKZDCIVPZ VDVKZSDFGD QNQSFQFEVG 1800 
PVCFU3 



Seq ID NO: 362 DNA sequence
Nucleic Acid Accession #: NM_003107
Coding sequence: 351-1775

1 11 21 31 41 51 

TTCCXXaGCA TTOGAGAAAC TCCTCTCTAC TTTAGCAOGG TCTCCAGACT CAQCCGAGAG 60 
ACAGCAAACT GCA60GCGGT 6AGAGAGOGA GAGAGAGGtA 6AGAGAGACT CTCXZAGCCTO 120 
GGAACTATAA CTCCTCTGC3B AGAGGG6GA6 AACTCCTTOC OCAAATCTTT T6GGSACTTT 180
TCICTCTTTA CCCACCTCGG 0CCCIG06AB OAGTTGAGGG GCCASTTOGG CCGCX360GG6 240

OGTCTTCCOG TTa3606TGT GCTTGGCOOO GGGAACOGGG AgG GC COGQC GATGGOGOCSG 300 

CGGOOGCOGC GAGGGTGTGA G0GC3GCGTGO GCGCC0GCCX3 AGCOSAGGCC ATGGTGCftGC 360 

AAACCAACAA TGCCGA<SAAC AOSGAAGOSC TGCTGGCOGG OGAGAGCTCG GACTCGGGOG 420 

OOGGCCTOGA GCTGGGAATC GCCTCCTCCC CCACGCC06G CTC CA OOSCC TCXA0GG60G 480 

GCAAOGCOSA 06ACC0GA6C T6GTGCAAGA CCCXX3AGT66 GCACATCAAO OGAOCCATGA 540 

AOGOCTTCAT GGTGTG6TCG CAGATOBAGC GGGGCAAGAT CAT6GA6CA0 T06COOC3ACA 600 

TGCACAACGC CGAGATCTCC AAGCGGCTGG GCAAACGCTG GAAGCTGCTC AAAGACAGOO 660 

ACaAGATCCC TTTCATTCS3A GAGGCGGAGC GGCTGCGCCT CAAGCACATG GCTGACTACC 720 

C06ACTACAA 6TAC0GGCCC A6GAAGAACG TXSAAGTCOGG CAAC6CCAAC TCCAQCrCCT 780 

OQGCOGCCGC CTCCTCCAAG CCGCGGGAGA AGOGAGACAA 66TGQGTG6C AQTOQOQOGO 840 

60G6CCATG6 GGGCGGOSGC GGGOGGGGOl 6CAGCAA0GC GGG6GGA6QA GGOGGOGGTG 900 

CGAGTOGCGG CGGCGCCAAC TCXAAACG6G OCSCAGAAAAA GASCTGOG G C TCXAAAGTGG 960 

OGGGGGGOGC 6G6CGGTG60 GTTAGCAAAC 0GCA06CCAA GCTCATCCTG GCAGGCGGOG 1020 

G06G0GGCG6 GAAAGCAC06 GCT GC OGCCG CC36CCTCCTT 06C0GC0GAA CAGGG6GGGG 1080 

COGCCOCCCT GCTGCCCCTG OGCXiCOGCOS COGAGCACCA CTOGCTSTAC AAGGOQOQGA 1140 

CTCCCAGOGC CTOGGCCTCC GCCTCCT0G6 CAGCCTCX36C CTC06CA606 CTOSOGGCCC 1200 

C!GGGCAAGCA CCTGGCX»3AG AAGAAGGTGA AGOGCGTCTA CCTGTTOGGC GGCCTGGGCA 1260 

OGTCGTCGTC GCCCGTGGGC GGCGTGGGCG OQGGAGCCXSA CCCCAGOGAC CCCCTGGGCC 1320 

TGTAOQAGGA GGAGG6CX3CG GGCTGCTCGC COGA0G06CC CAGCCTGAGC GGC0GCA6CA 1380 

GCGC06CCTC GTCCCCC6CC GCCGGCOGCT OGCCGGCOGA CCACOGGQGC TAOSOCAGCC 1440 

TGCGCGCC3GC CTCGCCCXSCX: COGTCCAGCG OGCCCTCGCA OGOGTCCTOC TOGGCCTOGT 1500 

CCCACTCCTC CTCTTCCTCC TCCTCGGGCT CCTCXJTCCrc CGAOGACGAG TTOGAAGACG 1560 

ACCTGCTC6A CCTGAACCXX: AGCTCAAACT TTGAGAGCAT GTCCCTGGGC AGCTTCAGTT 1620 

OGTCGTCGGC GCTCGACOGG GACCTGGATT TTAACTTCGA GCCOSGCTCC GGCTOGCACT 1680 

TCX5AGTTCCC GGACTACTGC AC3GCC0GAGG TGAGCX5AGAT GATCTCGGGA GACTGGCTOQ 1740 

AQTCCAGCAT CTCCAACCTG^GTTTTCACCT ACTGAAGGGC GCGCAGGCAG GGAGAAGGGC 1800 

OGGGQGGGGT AGGAGAGGAG'aAAAAAAAAG TGAAAAAAA6 AAAOGAAAAO GACAGAOGAA 1860 

GAGTTTAAAG AGAAAAGGGA AAAAAGAAAG AAAAAGTAAG CAGGGCTCGT TOGCCOGOGT 1920 

TCTCGTCXrrC GGATCAAGGA GCGOGGCGGC GTTTTGGACC OGOSCTCCCA TCCCCCACCT 1980 

TCCCGGGCC6 GGGACCCACT CTGCCCAGCC GGAGGGA06C 6GAGGAGGAA GAG6GTAGAC 2040 

AGGGGC6ACC TGTGATTQTT GTTATTGATG TTGTT6TTGA TG6CAAAAAA AAAAAGOGAC 2100 

TTOOAGTTTQ CTGCCCTTTG CTTGAAGAOA COTCCTCCOC CTT CCAA OGA 6CTT00GGAC 2160

TT8TCTGCAC CCX!CAGCAAG AAGGOGAGTT AGTTTTCTA6 AGACTTGAAG GAGTCTCCCC 2220 

CTTCXTTGCAT CACCACCTTG GTTTTGTTTT ATTTTGCTTC TTOGTCAAGA AAGGAGGGGA 2280 

GAACCCAGC3G CACCCCTCCC CCCCTTTTTT TAAACGCGTO ATGAAGACAG AAGGCTCCGG 2340 

GGTGACGAAT TTG6CCGATG GCAGATGTTT TGGGGGAAOCS COGGGACTGA GAGACTCCAC 2400 

OCAGGCQAAT ItXCUm m GGCCTTTTTT TCCTCCCTCT TTTOCCCTTG CCXCCTCIGC 2460 

A6CCG6AGGA GGAGATGTTG AGGG6A6GAG GCCACCCA8T GTGAGOQGOS CTAGQAAATO 2520 

ACCCGAGAAC CCCGTTG6AA 6CGCAGCAGC GGGAGCTAGG GQCGGGGGCG GAGGAGGACA 2580 

OGAACTGGAA GGGG6TTCAC GGTCAAACTG AAATGGA7TT GCACGTTGGG GAGCTGGCGG 2640 

0GG0G6CIGC TGGGCCTCCG CCTTCTTTTC TACGTGAAAT CAGTGAGGTG AGACTTCCCA 2700 

GACCCOGQAO GGQTGGAGQA GAGQAGACTG TTTGATCTGG TACAGGGGCA GTCAGTGGAG 2760 
GGOGAGTGOT TTCGGAAAAA AAAAAAGAAA AAAAGGG 

Seq ID NO: 363 Protein sequence 
Protein Accession #: NP_003098

1 11 21 31 41 51

| | | | | |

MVQQTNNAQI TEALLAGESS DSGAGLEIiGI ASSPTPGSTA STGGKADDPS WCKTPSGHIK 60 

RPMMAFMVHS QIERRRIMEQ SFDMBNAEIS KRLGKRWKLL KDSDKIPPIR EAERLRLK8M 120 

ADYPDYKYHP RKKVKSGHAN SSSSAAASSK F6ERBDKVG6 SQOGGBGGGG GGGSSNAGGO 180 

GQQAS66GAN SKPAQKKSOS SKVA6GAGG6 VSKFRAKLIL ASGG0G6KAA AAAAASFAAB 240 

QAGAAALLPL GAAA0HH8LY KARTPSASAS ASSAASASAA LAAPGKHLAE KKVKRVYLFG 300 

GLGTSSSPVG GVGAGAOPSD PLGLYEEBGA GCSFDAPSLS 6RSSAASSPA AGRSPADHRG 360 

YASLRAASPA PSSAPSEASS SASSHSSSSS SSGSSSSDDB FEDDLLDLt^P SSNFBSMSL6 420 
SFSSSSALDR DU3FNPBF8S OSHPBFPDYC TPEVSENI86 DHLESSZStn. VFTY 



Seq ID NO: 364 DNA sequence
Nucleic Acid Accession #: U10860
Coding sequence: 123-2204

1 11 21 31 41 51

| | | | | |

TGCCGGCTGC TCCTCGACX» GGCCTCCTTC TCAACCTCAO 0C0G06606C C36ACCCTTCC 60 

GGCACCCTGC G6C0C08TCT OGTACIGTOO OOGTCAOOQC GQC66CTC06 GCCCTG G OCC 120 

0GATG6CTCT 0T6CAA0GQA 6ACTCCAAGC T6GAGAAIQC TG6AG6AGAC CTTAAG6ATG 180 

GCCACCACCA CTATGAAGGA GCTGTTGTCA TTCTGGATGC TGGT6CTCAG TACGGGAAAO 240 

TCATAGACOG AAGA6TGAGG GAACTGTT06 TGCAGTCTGA AATTTTCCCC TTGGAAACAC 300 

CAGCATTT6C TATAAA6GAA CAA6GATTCC GTGCTATTAT CATCTCTGQA OGACCTAATT 360 

CTGT6TATGC TGAAGATOCT CCCTGGTTTG ATCCAGCAAT ATTCACIATT GGCAA6CCTG 420 

TTCTTGGAAT TT6CTATGGT ATGCAGATQA T6AATAA6GT ATTTG6AGGT ACTGTGCACA 480 

AAAAAAGTGT CAGAGAAGAT GGAGTTTTCA ACATTAGT6T 06ATAATACA TGTTCATTAT 540 

TCA66QG0CT TCAGAAGGAA GAAGTTGTTT TGCTTACACA TGGAGATAGT GTAGACAAAG 600 

TA6CTGATOG ATTCAAGGTT GIGGCACXnT CT6GAAACAT AGTAGCAGGC ATAGCAAA7G 660 

AATCTAAAAA GTTATAT66A 6CAC3U3TTGC ACCCTQAAGT TG6CCTTACA QAAAATGGAA 720 

AAGTAATACT GAAGAATTTC CTT7ATGATA TAGCTGGAT3 CAGTGGAACC TTCAC0GT6C 780 

AGAACAGAOA ACTT6A0TGT ATTCGASAGA TCAAAGAGAG AGTAGGCAC3G TCAAAAOTTT 840 

TGOTTTTACT CAGTOGTGGA GTAGACTCAA CAGTTTGTAC AGCTTTGCTA AATOGTGCTT 900 

TGAACCAAGA ACAAGTCATT GCTGTGCACA TTGATAATG6 CTTTATGAGA AAACX3AGAAA 960 

GCCASTCTGT TGAAGA6GCC CTCAAAAAGC TT66AATTCA OGTCAAAOTG ATAAATGCTG 1020 

CTCATTCTTr CTACAATGGA ACAAOUVCCC TACCAATATC A6ATGAAGAT AGAACCCCAC 1080 

GGAAAAGAAT TAGCAAAAOG TTAAATATGA CCACAAGTCC TGAAGAGAAA AGAAAAATCA 1140 

TTGG6GATAC TTTTGTTAAG ATTGCCAATG AAGTAATTGG AGAAATGAAC TTGAAACCAQ 1200 

AGGAGGTTTT CCTTGCCCAA GGTACTTTAC G6CCTQATCT AATTGAAAGT GCATCCCTTG 1260

TT6CAAGTG6 CAAAGCTGAA CTOiTCAAAh COCATCACAA T6ACACAGAG CTCATCAOAA 1320

AGTTGAGAGA GGAGGOAAM GTAATAGAAC CTCTGAAA6A TTTTCATAAA 6AT6AAGTGA 1380

GAATTTTGGG CAGAGAACTT C3GACTTCCAG AA6AGTTAGT TTOCAGGCAT CCAriTCCAG 1440

GTCCTGGCCT GGCAATCA6A GTAATATGTG CTGAAGAACC TTATATTTQT AAGGACTTTC 1500

CT6AAACCAA CAAIATTTT6 AAAATAGTAG CTGATTTTTC TGCAAGTGTT AAAAAGCCAC 1560

ATACCCTATT ACAGAOAGTC AAAGOCTtSCA CAACAGAAGA G6ATCA6GAG AAGCTGAT6C 1620

AAATTACCAG TCTGCATTCA CTQAATGCCT TCTTGCTGCC AATTAAAACT GT A GGT G TGC 1680

AGGGTGACTG TOGTTCCTAC AGTTAOGTGT GTGGAATCTC CAGTAAAGAT GAACCTGACT 1740

GGGAATCACT TATTTTTCTG GCTAGGCTTA TACCTCGCAT GTGTCACAAC GTTAACAGAG 1800

TTGTTTATAT ATTTGGCCCA CXAGTTAAAG AAOCTCCTAC AGATGTTACT CCCACTTTCT 1860

TGACAACAGG GGTGCTCAGT ACTTTAOOCC AAGCXGATTT T6AG6CCCAT AACATTCTCA 1920

GGGAGTCTGO GTATGCTG6G AAAATCAGCC A6ATGCCGGT GATTTTGACA CCATTACATT 1980

TtGATCX^GGA CCCACTTCAA AAGCAGCCTT CATGCCAGAG ATCTGTGGTT ATrOGAACCT 2040

TTATTACIAG TGACTTCATG ACTGGTATAC CTGCAACACC TGGCAATGAG ATCCCTGTAG 2100

AGGTGglATT AAAGATGGTC ACTGAQATTA AGAAGATTCC TGGTATTTCT C6AATTATGT 2160
ATGACTTAAC ATCAAA6CCC CCAGGftACTA CTGAGTGOGA GTAATAAACT TC

Seq ID NO: 365 Protein sequence
Protein Accession #: AAA6033X

1 11 21 31 41 51

| | | | | |

MAliOIGDSXL EMAGGDZiKDG HHHYBGAWI ZAAGAQYOKV IDRRVSELFV QSEXFPLETP 60

AFAZKBQGFR AZIISGGPNS VYAEDAPHFD PAZFTI6KFV ZAXCYGMOKM NKVPGCTTVEK 120

KSVREDGVFN ISVDNTCSLP RGIiQKEEWl. LTHGDSVDKV RDGFKWARS GKIVASIANB 180

SKKLYGAQPH PBVGLTENGK VILKNFLYDI AGCSGTPTVQ URELECIREI KERVGTSKVL 240

VLLSGGVDST VCTALLNRAL MQBQVIAVHI DNGFMRKRES QSVEEALKKL GIQVKVINAA 300

HSPyMGTTTL p'ZSDEDRTPR KRISKTUIMT fSPEBKRICZI GDTFVKIANB VZGEMNLRPS 360

EVFLAQGTLR PDLIESASLV ASGKAELIKT KHNDTBLIRX LREESKVIEP LKDFHKDBVR 420

ILGRELGLPS ELVSRHPFPG PGLAIRVICA EEPYICKDPP BTNNILKIVA DPSASVKKPH 480

TLLQRVKACT TEEDQEKU4Q ITSI^SLNAP LLPIKTVGVQ GDCRSYSYVC OISSKDBPDW 540

BSLZPLARLZ PRMCHNV27RV VYZPGPPVKE PPTDVTPTPL TT6VLSTLRQ ADFEABtfllfR 600

BSGYAGRISQ MFVILTPLHF DROPLQKQPS OQRSWIRTF ITSDFMTGIP ATPGNBIPVB 660
WLKMVTBZK KIPGZSRIMY DLTSKPPGTT EWE

Seq ID NO: 366 DNA sequence
Nucleic Acid Accession #: NM_004219
Coding sequence: 46-654

1 11 21 31 41 51

| | | | | |

G06GCCTCAG ATGAAT6CXX3 CTCTTAAGAC CIGCAATAAT CCA6AATG6C TACTCTGATC 60

TATGTTGATA A66AAAATQG AOAACCAOGC ACOOSTGTG G TT6CTAAG6A T6G6CTGAA0 120

CTGGGGTCTG GACCTTCAAT CAAAGCCTTA GATGGGAGAT CTCAAGTTTC AACACCAOST 180

TTTGGCAAAA OSTTCGATGC CCCACCAGCC TTACCTAAAG CTACTAGAAA GGCTTTGGGA 240

ACI6TCAACA 6A6CTACAGA AAAGTCTGTA AAGACCAAGG GACCOCTCAA ACAAAAACAG 300

CCAA6CTTTT CTGOCAAAAA GATGACItSAa AAGACTQTTA AAGCAAAAA6 CTCTGTTCCT 360

GCCTCAGATO ATQCCTATCC AGAAATAGAA AAATTCTTTC CCTTCAATCC TCTAGACTTT 420

GAGAGTTTTG ACCTGCXTTGA AGAOCACC3U3 ATTGOGCACC TCCOCTTGAG TGGAGTGCCT 480

CTCATGATCC TTGACGAGGA GAGAGAGCTT GAAAAGCPGT TTCAGCTGGG CCCCCCTTCA 540

CCTGTGAAGA TGCCCTCTCC ACCAT6GGAA TCCAATCTGT TGCAGTCTCC TTCAAGCATT 600

CTQTCGACCC TGGATCTTGA ATTOCCACCT G 'lT mC l tflTa ACATAGATAT TTAAATTTCT 660

TAGTGCTTCA GAGTTTGTGT GTATTPGTAT TAATAAAGGA TTCTTCAACA GAAAAAAAAA 720
AAAAAAAA

Seq ID NO: 367 Protein sequence
Protein Accession #: NP_004210

1 11 21 31 41 51

| | | | | |

KATLIYVDKE NGBPGTRWA KDGLKLGSGP SIKAUXSRSQ VSTPRFGKTF DAPP ALPKAT 60

RKAIiGTVNRft TERSVKTKGP LKQKQPSFSA KKMTEKTVKA KSSVPASDDA YPEIEKFFPF 120

NPLDFESFDL PEEHQIAHLP LSGVFLHILD EERHLEKLFQ liGPPSFVXMP SPFHESNXiLQ 180
SPSSIXjSTU) VELPPVCCDI DI

Seq ID NO: 368 DNA sequence
Nucleic Acid Accession #: NM_000597
Coding sequence: 118-1104

1 11 21 31 41 51

| | | | | |

ATTCGQGGCO AGGGAGGAGG AAGAAGCGGA GGAGGCGGCT CCOGCTOOCA OQGCCGTGCA 60

OCTCCCOGCC OGCCCGCT06 CTOGCTCGCC OGCOOOQCOS CQCTQCOGAC CGCCAGCATG 120

CTGC0GAGA8 TGGGCTGCCC CGCGCPGCOG CTOCCQCCOC OGCCGCTGCT GCOGCTGCTG 180

OOGCTGCTGC TGCT6CTACT GGGOGCGAGT GGCGGOGGCG GOSGGGCGOG OGCOGAGGTO 240

CTGTTCCGCT GCCCGCCCTG CACACCCGAG CGCCTGGCXX; CCTGOGGGOC CCC8CCGGTT 300

G06CCQCCCG CCGCGGTGGC OGCACTGGCC GGAGGOGCCC GCATGCCATO GG0G6A6CTC 360

GTC0G6GAGC GGGGCTG066 CTGCT6CT0G GTGTQCGCCC GGCTGGAGGG CGAGGOGTGC 420

GGOGTCTACA CCC0G0GCT6 C06CCAGGGG CTGCGCTGCT ATCCCCACCC GGGCTCCGAG 480

CTGCCCCTGC AQOCGCTGGT CATGGGOGAG GGCACTTGTG AGAAGOGCCG GGAOGCCGAG 540

TATGGCGCCA GCCOGGAGCA GGTTGCAOVC AATGGOGATG ACCACTCAGA AGOAGOCCTO 600

GTG6AGAACC A06TGGACA0 CAOCATGAAC AT GT T GG GOG 6GGGAGGCAG TGCTGOCOGG 660

AAGCCCCTCA AGTCGGGTAT GAAQGAGCTG GCOQTGTTCC GQ6AGAAGGT CACTGAGCAG 720

CAC066CAGA TG66CAAGQG TG6CAAGCAT CACCTTGGOC TGGAGGAGCC CAAGAA6CT6 780

06ACCACC0C CIGCCASGAC TOCCSGOCAA CAQGAACIGO ACCAGGTCCT 0SAG06GATC 840



TCX3UXATGC GCCTTCCGGA TGAGOGGGGC CCTCTGGAGC ACCrCTACTC CCTGCACATC 900

CCCAACTGTG AC3VAGCATGG CCTGTACAAC CTC3VAACAGT GCAAGATGTC TCTGAACGGG 960

CAG OGTGGGG AGTG CTGGT G TCTGAA CCCC MVCAC06GGA AGCTGATCCA 666AGCCCXX 1020

ACCATCOGGG GGGACCCCQft GTQTCIVTCTC TTCTACAATO ftSCftGC AGGA GGCTTGCOGO 1080

GTGOICACCC AGCGGATGCA GTAgACOGCA GCCA6CCGGT 6CCI0G06CC OCTGOCCOCC 1140

GCCCCTCTCC AAACACCGGC AGAAAACGGA GAGTGCTTGG GTGGTGGGTG CTGGAGGATT 1200

TTCCAGrrCT GACACACGTA TTTATATTTG GAAAGAGACC AGCACCX3A6C TOGGCACCTC 1260

CCOCSGCCTCT CTCTTCCCAG CTGCAGATGC CACACCTGCT CCTTCTTGCT TTOXOGGGG 1320

GAGGAA6GGG GTT6TGGTCXS GGGAGCTGG6 GTACAGGTTT GGGGAGGGGG AASA6AAATT 1380
TTTATTTTTG AACCCCTGTG TCCCTTTTGC ATAAGATTAA AGGAAGQAAA AST

Seq ID NO: 369 Protein sequence
Protein Accession #: NP_000588

1 11 21 31 41 51

| | | | | |

MLPRVGCPAL PLPPPPLI^PL LPLLLIiUiGA SOGGGGARAB VLFRCPPCTP ERLAACGPPP 60

VAPPAAVAAV AGGARMPCAB LVREPGCGCC SVCARLEGSA OSVYTPRCGQ GLRCYPHPGS 120

ELPLQALVKG BGTCSKRRDA EYGASPBQVA DNGDDHSE6G LVQIHVD5TM MMZiOGGGSAG 180

RKPLK5C94KE LAVFREKVTB QHRQM6KGGK HELGLBSPKK LRPPPARTPC QQELOQfVLER 240

ISTMBLPDER GPLERLYSLB IPHCDKBGLY MLKQCRNSU7 GQRGECWCVH PNTGKLIQGA 300
PTIRGDPECH LFYHBQQ£AC GVRTQRMQ

Seq ID NO: 370 DNA sequence
Nucleic Acid Accession #: NM_004264
Coding sequence: 6-440

1 11 21 31 41 51

| | | | | |

GGAACATGGC GGAT0CX3CTC AGGCAGCTTC AGGAQ6CTGT GAATTOGCTT GCAGATCAGT 60

TTTGTAAT6C CATTG6AGTA TTGCAGCAAT GT GG TOCTOC TGCCTCTTTC AAT AATATT C 120

AGACAGCAAT T3UVCAAAGAC GAGCCAGCTA ACCCTACAGA AGAGTATGCC CAGCTTTTTQ 180

CA6CACTGAT TGCAOQAACA GCAAAAGACA TTGATGTTTT GATAGATTCC TTACCCAGTG 240

AAGAATCTAC AGCTGCTTTA CAGGCTGCTA GCTTGTATAA GCTAGAAGAA GAAAACCATG 300

AAGCTGCTAC ATGTGT6GAG GATGTTGTTT ATOGAGGAGA CATGCTTCTG GA6AAGATAC 360

AAAGOGCACT TGCT6ATATT 6CACAGTCAC AGCTGAAGAC AAGAAGTGGT AOOCATAOCC 420

AGTCTCTTCC AGACTCATAG CATCAGTGGA TACCATGT6G CTGAGAAAAG AACTGTTT6A 480

6TGCCATTAA GAATTCTQCA TCA6ACTTAQ ATACAAGCCT TACCAAC3UVT TACAGAAACA 540

TTAAACACTA TGACACATTA CCTTTTTAGC TATTTTTAAT AGTCTTCTAT TTTCACTCTT 600

GATAAGCTTA TAAATCATGA TtGAATCAGC TTTAAAGCAT CATACCATCA TTTTTTAACT 660

CSAGTGAAATT ATTAAGGCAT GTAATACATT AATGAACATA ATATAAS6AA ACATATGTAA 720

AATTCTGTTA TGACATAATT TATGTCTCCA TTTTGTTGTA TTQGOCAGTA CTTTTACAAT 780
C

Seq ID NO: 371 Protein sequence
Protein Accession #: NP_004255

1 11 21 31 41 51

| | | | | |

MADRLTQLQD AVNSLADQFC NAIGVLQQCG PPASFHNIQT AINKDQPANP TEEYAQLFAA 60

LZARTAKDID VLIDSLPSBE STAALQAASL YKXiBEENHBA ATCVEDWYR GDMLI£KZQS 120
ALADIAQSQL KTELSGTHSQS LPDS

Seq ID NO: 372 DNA sequence
Nucleic Acid Accession #: AJ271091
Coding sequence: 1-1113

1 11 21 31 41 51

| | | | | |

ATGGAfiAATC AGGTGTTOAC GCCGCATGTC TACTOQGCTC AG0GACACC6 OGAGCTATAT 60

CTGOGOGTGG AGCTGAGTOA CGTACAGAAC CCTGCCATCA GCATCACTGA AAAOGTGCTG 120

CATTTCAAAG CTCAAGGACA TGGTGCCAAA GGAGACAATG TCTATGAATT TCACCT6GAG 180

TTCTTAGACC TTGTGAAACC AGAGCCTGTT TACAAACPGA CCCSUSAGOCA GGTAAACATT 240

ACAGTACAGA AGAAAGTGAG TCAGTGGTGG GAGA6ACTCA CAAAGCAGGA AAA6CGACCA 300

CTGTTTTTGG CTCCTGACTT TGAT0GTT6G CTGGATQAAT CTGATGCGGA AAT6GA6CTC 360

AGAGCZAAG6 AAGAAGAGCSS CCTAAATAAA CTCOGACTGG AAAGCGAAG6 CTCTCCTGAA 420

ACTCTTACAA ACTTAAGGAA AGGATACCTG TTTATGTATA ATCTTGTGCA ATTCTTGGQA 480

TTCTCCTGGA TCTTTGTCAA CCTGACTGTG OGATTCTGTA TCTT6GGAAA AGAGTCCTTT 540

TATGACACAT TOCATACTGT GGCTGACATG ATGTATTTCT GCCAQATGCT GGCAOTTOTQ 600

GAAACTATCA ATGCA6CAAT TGGAGTCACT AOGTCACCQO TQCTGCCTTC TCTGATCCAG 660

CTTCTTGGAA GAAATTTTAT TTTGTTTATC ATCTTTGGCA CCATGGAAGA AAT6CAGAAC 720

AAAOCTGTX^S TTTTCTTTGT GTTTTATTTG TGGAGTGCAA TTGAAATTTT CAGGTACTCT 780

TTCTACATGC T6A06TGCAT TGACATGGAT TGGAAG6T6C TCACATGGCT TCGTTACACT 840

CTX5TGGATTC CCTTATATCC ACTGGGATGT TTGGOGGAAG CTGTCTCAGT GATTCAGTCC 900

ATTCCAATAT TCAATGAGAC OGGAOSATTC AGTTTCACAT TGCCATATCC AGTGAAAATC 960

AAAGTTAGAT TTTOCTTTTT TCTTCAGATT TATCTTATAA TGATATTTTT AGGTTTATAC 1020

ATAAATTTTC GTCACCTTTA TAAACAGOGC AGACTGAAAA TGAGGGCAGG CGCAGTGGCT 1080
CATGCCTGTG ATCOCAGOGC TTTGGGAGGC TCA

Seq ID NO: 373 Protein sequence
Protein Accession #: CAB69070

1 11 21 31 41 51

| | | | | |

HE37QVLTPHV YWAQRHRBLY LRVELSDVQN PAISITOmi HFKAQGHGAK GDKWEFHLE 60

FLDLVKPEPV YKLTQRQVNI TVQIOCVSQWW ERLTKQEKRP LPIAPDPDRW LDESDAH^EL 120

RAREEBRUIK IiSLBSEGSPE TLTNLRXGYL FMViaLVQFLG FStflFVNLTV RFCILGKESF 180

YDTFHrVADM MYFCQMLAW ETINAAIGWT TSPVLPSLIQ UjGRNFILFI IFGSMBBNQH 240

KAWFFVFYL WSAIBIFRYS FYML TCID MD NKVLTHLRYT LWIFLYFLGC LABAVSVZQS 300

XPIFHETGRF SFTLPYPVKI KVRFSFFLQZ YLIHIFLGLY IKFRELYKQR iOiKHRAaWA 360
HAC0PSALG6



Seq ID NO: 374 DNA sequence
Nucleic Acid Accession #: NM_016395
Coding sequence: 1-1113

1 11 21 31 41 51

| | | | | |

ATGOAGAATC AGGTCTTGAC GCOGCATGTC TACTGGGCTC AGOGACACCG OGAGCTATAT 60 

CTC0GCX5TGG AGCTGAGTGA OSTACAGAAC CCTGCCATCA GCATCACTC5A AAAOCrTCCTG 120 

CATTTCAAAG CTCAAGGACZA TGGT6CCAAA GGA6ACAATG TCTATGAATT TCACCTG6AG 180 

TTCTTAGACC TTCTGAAACC AGAGCCTGTT TACAAACTGA CXrCAGAGGCA GG7AAACATT 240 

AGAGTACAGA AGAAAGTGAG TCAGTGGT6G 6AGAGACTCA CAAAGCAGGA AAAGOGACCA 300 

CTGTTrnGG CTCCTGACTT TGATCGTTGG CTGGATGAAT CTGATGOGQA AATQGAGCTC 360 

AGASCTAASG AAQAAQAGCG CCTAAATAAA CTCCGACTGG AAAGCGAAGQ CTCTCCTGAA 420 

ACTCTTACAA ACTTAAGGAA AGGATACXTPG TTTATGTATA ATCTTGTGC3V ATTCTT GGGA 480 

TTCTCCTGGA TCTTTGTC3UV CCTGACTGTG CGATTCTGTA TCTTGGGAAA AGAG TCCTTT 540 

TATGACACAT TCCATACTGT GGCTGACATG ATGTATTTCT GCCAGATGCT GGCAGTTGTG 600 

GAAACTATCA ATOCAQCAAT TGGAGTCACT ACGTCACCGG TGCTGCCTTC TCTGATCCAG 660 

CTTCTTGGAA QAAATTTTAT TTTGTTTATC ATCTTTGGC3V CCATQGAAGA AATGCA6AAC 720 

AAAGCTGTGG TTTTCTTTGT GTTTTATTTG TGGAGTGCAA TTGAAATTTT CAGGTACTCT 780 

TTCTACATGC TGACGTGCAT TGACATGGAT TGGAAGGTGC TCACATGGCT TaTTTACACT 840

CTGTGGATTC CCTTATATCC ACTGGGATGT TTGGCGGAAG CTGTCTCAGT GATTCAGTCC 900 

ATTCCAATAT TCAATGAQAC OSGACGATTC AGTTTCACAT TGCCATATCC AGTGAAAATC 960 

AAAGTTAGAT r i ' iXX ' mTi ' TCTTCAGATT TATCTTATAA T6ATATTTTT AGGTTTATAC 1020 

ATAAATTTTG GTCAOCTTTA TAAACAGGGC A6ACTGAAAA TGA6GGCAG8 06CAGTGGCT 1080 
CATOCCTOTO A7CCCM3C6C TTTGGGAGGC TGA 



Seq ID NO: 375 Protein sequence
Protein Accession #: NP_057479

1 11 21 31 41 51

| | | | | |

MENQVLTPHV YWAQRHRKLY LRVELSDVQN PAISZTEUVL HFKAQGBOAK GDNVYCFKLE 60 

FLDLVKPEPV YKLTQRQVNI TVQKKVSQWW ERLTKQEKRP IiFLAPDPDRK LDESDAH4EL 120 

RAKEEERLNK LRLESBGSPB TLTNLRKGYL FMYNLVQETiG FSWIFVNLTV RFCILGKESP 180 

YDTFHTVADM MYFOQMLAW ETINAAIGVT TSPVLPSLIQ LLGRNFILFI IFGTMEEMQH 240 

XAWPPVPYIj WSAIEIPRYS PYMLTCIDMD WKVLTWLRYT LWIPLYPLGC LVBAVSVXQS 300 

IPIFKBTGRP 8FTl>PYPVECX KVRFSFFLQI YLZMIFLGLY ZNFRBLYKQR RRRYGKKRXR 360 
STKKKDLD6F LPV 

Seq ID NO: 376 DNA sequence
Nucleic Acid Accession #: NM_005987
Coding sequence: 1-270

1 11 21 31 41 51

| | | | | |

ATGAA7TCTC AGCAGCAGAA GCAGCCTT6C ACCCCACCCC CTCAGCCTCA GCAGCAGCAG 60 

GTGAAACAAC CTTGCCAGCC TCCACCCCAO GAACCATOCA TCCOCAAAAC CAAGGAGOCC 120 

TGCCAACCCA AGGTGCCTGA GCCCTGCCAC (XCAAAGTGC CTGAGCXXTO CCAGOCCAAG 180 

ATTCCAGAGC CCTGCCAGCC CAAGGTGCCT GAOCCCTGOC CTTCAACGGT CACTCCAOCA 240 
CCA6CCCAGC AQAAGACCAA GCAGAAGTAA 

Seq ID NO: 377 Protein sequence
Protein Accession #: NP_005978

1 11 21 31 41 51

| | | | | |

HNSQQQXQPC TPPPQPQQQQ VKQPCQPPPQ EPCIPKTKEP OQPKVPBPCH PKVF5P0QPK 60 

ZPBPOQPECVP EPCPSTVTPA PAQQKTKQX 
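
The relationship between a "Coding sequence" annotation and the length of the corresponding protein entry can be checked with a short script. The following Python sketch is illustrative only and is not part of the original sequence listing. The function names are hypothetical, and the sketch assumes that the annotated range is 1-based, inclusive, and ends with the stop codon, as appears to hold for Seq ID NO: 376 (range 1-270, protein of 89 residues). Other entries in this listing may follow a different end-coordinate convention, in which case the check raises an error rather than guessing.

```python
import re


def parse_coding_range(annotation):
    """Parse a range such as '1-270' or '26..1453' into (start, end).

    Both coordinates are treated as 1-based and inclusive (an assumption).
    """
    match = re.fullmatch(r"\s*(\d+)\s*(?:-|\.\.)\s*(\d+)\s*", annotation)
    if match is None:
        raise ValueError(f"unrecognized coding range: {annotation!r}")
    return int(match.group(1)), int(match.group(2))


def expected_protein_length(start, end):
    """Return the residue count implied by a coding range.

    Assumes the range covers whole codons and that the final codon
    is a stop codon, which is not counted as a residue.
    """
    nucleotides = end - start + 1
    if nucleotides <= 0 or nucleotides % 3 != 0:
        raise ValueError("range does not span a whole number of codons")
    return nucleotides // 3 - 1


if __name__ == "__main__":
    start, end = parse_coding_range("1-270")
    print(expected_protein_length(start, end))
```

Under the same assumptions, the range 74-505 given for the DNA entry that follows implies a 143-residue protein, which agrees with the three rows (60 + 60 + 23 residues) of its protein entry.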



Seq ID NO: 378 DNA sequence
Nucleic Acid Accession #: NM_002105
Coding sequence: 74-505

1 11 21 31 41 51

| | | | | |

ACAGCAGTTA CACTGCGGCG GGCGTCTGTT CTA0TGTTT6 AGCCGTOGTG CTTCAC066T 60 

CTACCTCGCT AGCATGTOGG GCGGOGGCAA GACTGQCGGC AAGGCOCGCG CCAAGGCCAA 120 

GTCGCGCTCG TCGOGCGCCG GCCTCCAGTT CCCAGTGGGC CGTGTACACC GGCTOCTGOQ 180 

GAAGGGCCAC TACGCOGAGC GCGTTGGCGC CGGOGCGCCA GTGTA0CTQ6 CGGCAGT6CT 240 

GGAGTACCTC ACOGCTGAGA TCCTGGAGCT GGOGGGCAAT OCQOCCOGCG ACAACAAGAA 300 

GAOGCGAATC ATCCCC06CC ACCTGCAGCT GGCCATCOGC AAOSAOGAGG AGCTCAACAA 360 

GCTGCTGGGC GGOGTGACGA TCGCCCAGGG AGGOGTCCTO CCCAACATCC A6GC06TGCT 420 

GCTGCCCAAG AAGACCAGOG CCACCGTGGQ GC06AAGGCG CCCTOGGGOG GCAAGAAQGC 480 

CACCCAGGCC TCCCAGGAGT ACTAAGAGGG CCCGCGCCGC GGCOGGCOGC CCCAOCTOCC 540 

CATGCCACCA CAAAGGCCCT TTTAAGGGCC ACCACCGCCC TGATGGAAAG AOCTGAGCOQ 600 

CrrCAGACTO OGGGOCAAQC GGGCC6CGGC TCCCTTCCCC TCCCCTCCOC TCGCOOGCCT 660 

TCGCO GO OOG GCCTG0A6TC CCOGCCCGCC 0CC6CTCCCG TCCCGCACCG CCTCCOGCGT 720 

06GCCTCGG6 CCTGOCCTGT OOGC OG TCO B CCCTCOGGTA GGGTTCGGGC CTTC CGGA TG 780 

OGGCTTGGGC GCTCTTCGGG GACCTCOGTO G0GCGGAA6A CCOGAGCCTG CCGGGGOGAQ 840 
6C0G6CG60G O0GCAGCT6C C0GOCTC6GC GrroGTOACT CAG0060CCC ATCCCGAGTC 900

6CTAAGG6GC TGGGGGGAGG COSCAGCAOC TTCTGGAAEJV CTT06CCTTC OGCTCTGACG 960 

CAGGGCCGAG GTGGGCAGTC CAGGCCGAGA GCCGGCGGCC CTGAAGGTGA GTGAG GCCC T 1020 

CGGCAGCTGC AGCCX3QGGTG TCTGGTACCC CCCCGGOGTG GTGCTTAGCC CAGGACTTTC 1080 

AGAOGGCOGC T GG 0C3SGGAG GCTTTGGTGG GAGAGACGCG ATOBCOGATT TOGGTCTOGC 1140 

GCCCCTTCTG OSGOOGGGAC CCWMCCTTT CACATCAGCT CTCCCTCOIT CTXmrPCAT 1200 

AGGTCXGC6C TGGG GC OGGG AOSAAGCACT TGGTAACAG6 CACATCTTCC TCCCGAGTGA 1260 

CTGCCTCCTA GGAGGACATT TAGGGQAGGG CAGAGGCCTG CAGTTTGGCT TCACGGCTGG 1320 

CTATGTGGAC AGCAAGAGTC GTTTTGCGGA ACGCGACTGG CAGCCAGGCC TGTOGGGCCC 1380 

C0GA06C0GC OCCATTTCCC TTCCAGOVAA CTCAACTCGG CS^TCCAAGC ACCTAGATAC 1440 

CA6CACAAOT OOGTTAATCC CTGT C TGGAC TCAGCCTCOG TTGGCTTCTG AACTGC5AATT 1500 

CTGCAGCTAA CCCTTCCACG ACXAGAACCT TA06CATTGQ GGAGTTTTAG ATGGACTAAT 1560 
TTTATTAAAG GATTGTTTTT TTTTT 



Seq ID NO: 379 Protein sequence
Protein Accession #: NP_002096

1 11 21 31 41 51

| | | | | |

MSGRGKTGGX ARAKAKSRSS RA6LQPPVGR VHBLLHKBHY AERVGAGAPV YLAAVLEYLT 60 
AEILELAGNA ARI3NKKTRZI PRHLQIAZBN DEBLHKLLG6 VTZAQGGVLP KIQAVUiPRK 120 
TSATVGPKAP SGGKKATQAS QEY 



Seq ID NO: 380 DNA sequence
Nucleic Acid Accession #: AL136942
Coding sequence: 184-864

1 11 21 31 41 51

| | | | | |

ACGOGTCOGG CAGAAGCTCG GAGCTCTOGG GGTATCGAGG AGGCAGGCOC GCX3GG0GCAC 60 

GGGCGAGGGG GCCGGGAGCC GGAGCGG0G6 A6GAGCGGGC AGCAGOGGCG CX3G0GGGCTC 120 

CAOGGGAGGC GOTOQAOGCT CCTGAAAACT TGOGCGOGCG CTCGCGCCAC TGCGCCCGGA 180 

GOQATGAAGA TGGTCOCGCC CTGGACGOGG TTCTACTCCA ACAGCTGCTG CTTGTGCTGC 240 

CATGTCOGCA CCXSGCACTAT CXTTGCTOGGC GTCTGGTATC TGATCATCAA TGCTGTGGTA 300 

CTGTTGATTT TATTGAGTGC CCTGGCTGAT CCGGATCAGT ATAACTTTTC AAGTTCTGAA 360 

CT6G6AGGTG ACTTTGAGTT CATGGATGAT GCCAACATGT GCATTGCCAT TGCGATTTCT 420 

CTTCTCATGA TCCTQATATO TGCTATOOCT ACTTAOGGAG OGTACAAGCA AOGCGCAGCC 480 

TOGATCATCC CATTCTTCTG TTACCAGATC TTTGACTTTO COCTGAACAT GTTGGTTGCA 540 

ATCACTGTCC TTATTTATCC AAACTCCATT CAGGAATACA TAOGGCAACT GCCTCCTAAT 600 

TTTCCCTACA GAGATGATGT CATGTCAGTG AATCCTACCT GTTTGGTCCT TATTATTCTT 660 

CTGTTTATTA GCATTATCTT GACTTTTAAG GGrTACTTGA TTAGCTGTGT TTGGAACTGC 720 

TACOGATACA TCAATOQTAO GAACTCCTCT GATGTCCTGG TTTATeTTAC CAGCAATGAC 780 

ACTAOOGTGC TGCTACCCCC GTATGATGAT GCCACTGTGA ATGGT6CTGC CAAGG AGCC A 840 

CCX3CCACCTT ACGTGTCTGC CTAAGCCTTC AAGTGGGOSG AGCTGAGGGC AGCAGCTTGA 900 

CTTTGCAGAC ATCTGAGCAA TAGTTCTGTT ATTTCACTTT T6CCATGAGC CTCTCTGAGC 960 

TTGTTTGTTG CTGAAATGCT ACTTTTTAAA ATTTAGATGT TAGATTGAAA ACTGTAGTTT 1020 

TCAACATATG CTTTGCTAOA ACACTG16A7 AGATTAACIG TAGAATTCTT CCT8TA08AT 1080 

TOGGOATATA ACOOGCTTCA CTAACCTTCC CTAGGCATTG AAACTTCCCC CAAATCTQAT 1140 

GGAOCTAGAA GTCTGCTTTT GTACCTGCTG GGCCCCAAAG TTOGGCATTT TTCTCTCTGT 1200 

TCCCTCTCTT TTGAAAATGT AAAATAAAAC CAAAAATAGA CAACTTTTTC TTCAGCCATT 1260 

CCAGCATAGA GAACAAAACC TTATGGAAAC AGGAATGTCA ATTGTGTAAT CATTG TTCTA 1320 

ATTAOGTAAA TAOAAGTCCT TAT0TAT6TG TTACAAGAAT T TCOO CCACA ACATCCTTTA 1380 

TGACTQAAGT TCAATGACAG TTTGTGTTTO GTG6TAAAGG ATTTTCTCCA IGGO CTOA AT 1440 

TAAGACCATT AOAAAGCACC AGGCCGTGGG AGCAGTGACC ATCTACTGAC TGTTCTTQTG 1500 

GATCTTGTGT CCAGGGACAT GGGGTGACAT GCCTCGTATG TGTTAGAQGG TGGAATGGAT 1560 

GT6TTT6GCG CTGCATGGGA TCTGGTGCXr CTCTTCTCCT GGATTCACAT CCCCfXXCAG 1620 

GGCCCGCTTT TACTAAGTGT TCTGCCCXAO ATTGGTTCAA GGAOGTCATC CAACT SACTT 1680 

TATCAAGTGG AATTOGGATA TATTTGATAT ACTTCTGCCT AACAACATG6 AAAAO GGTTT 1740 

TCTTTTCCCT GCAAGCTACA TCCTACT6CT TTGAACTTCC AAGTATGTCT AGTCACCTTT 1800 

TAAAATGTAA ACATTTTCAG AAAAATGAGG ATTGCCTTCC TTGTATQCGC TTTTTACCTT 1860

GACTACCTGA ATTGCAAGGG ATTTTTATAT ATTCATATGT TACAAAGTCA GCAA CTCTC C 1920 

Iti'n'GG' f l' CA TTATTQAATG TGCTGTAAAT TAAGT08TTT OCAATTAAAA CAAGGTTTGC 1980
CCACATCCAA AAAAAAAAAA AAAAA 



Seq ID NO: 381 Protein sequence
Protein Accession #: CAB66876

1 11 21 31 41 51

| | | | | |

MKMVAPWTRF YSNSCCLCCH VRTCTILLGV WYLIINAWL LILLSALADP DQYNFSSSEL 60
GGDFEFMDOA NMCIAIAISL LNZLICAMAT YGAYKQRAAH IZPFPCYOZF DFALNMLVAZ 120
TVLZYPNSZQ EYZHQIiPPNP PYRISDVMSVH PTCLVLZZLL FZSIILTFKG YLZSCVMBCY 180
RYZNGRNSSD VLVYVTSMDT TVLLPPYDDA TVN6AAKEPP PPYV8A



Seq ID NO: 382 DNA sequence
Nucleic Acid Accession #: NM_002510
Coding sequence: 92-1774

1 11 21 31 41 51

| | | | | |

CA6ATGCCAG AAGAACACTG TTGCTCTTOG TGGAOQGGCC CAGAGGAATT CAGAGTTAAA 60 

CCTTGAGT6C CTQO GT COGT GAGAATTCAQ CATQGAATGT CTCIACTATT TCCTGG6ATT 120 

'VCYGC r CCT G GC7GCAAGAT T6CCACTTGA T600GCCAAA CXSA7TTCATG AT6TGCT66G 180 

CAATGAAAGA CCTTCTGCrT ACATGAGGGA GCACAATCAA TTAAATGGCT GGTCTTCTGA 240 

TGAAAATGAC TG6AATGAAA AACTCTACCC AGTGTGGAAQ OGGGGAGACA TGAGGTGGAA 300 

AAACTCCTG G AAGGGAGGCC GT6T6CAGGC GGTCCTGACC AGTGACTCAC CAGCCCT06T 360 
GGGCTCAAAT ATAACATTTG OGCHGAAOCT GATATTCCX7 ABIITGCCAAA AGGAAGATGC 420

CAATGGCAAC ATAGTCTATG AQAAGAACTG CAGAAATGAG QCTGGTTTAT CTGCTGATCC 480

ATATGTTTAC AACTGGACAO CATOGfTCAGA GGACAGTGAC CSS3GAAAATG GCACCGGCCA 540 

AAGCCATCAT AAOGTCTTCX: CTGATGGGAA AOCTTTTCCT CACCACCCCG GATGGAGAAG 600 

ATGGAATTTC ATCTAOGTCT TCCACACACT TGGTCAGTAT TTCCAGAAAT TGGGAOGATG 660 

TTCAOTQAGA GTTTCIGTGA ACACAGOCAA TGTGACACTT GGGOCTGAAC TCAT6GAAGT 720 

GACTGTCTAC A6AAGACATG GACGGGCATA TGTTCCCATC GCACAAGTGA AAGATGTGTA 780

CGTGGTAACA GATCAGATTC C TG l Xy iYl Xyr GACTAT6TTC CAGAAGAAOG ATOGAAATTC 840 

ATCOGACGAA ACCTTCXTCA AAGATCTCCX: CATTATGTTT GATGTCCTGA TTCATGATCC 900 

TAGCCACrrC CTCAATTATT CTAOCATTAA CTACAAGTGG AGCTTCGGGG ATAATACTG6 960 

cciVA ' nvrr tccaccaatc atactgtgaa tcacaostat gtgctcaatg gaaocttcag 1020 

CCTTAACJCTC ACTGTGAAAG CTGCAGCACC AGGACCTTGT CC3 G CC A C0GC CAOCACCACC 1080 

CAGA CCI T CA AAACCCACCC CTTCTTTAGG ACCTGCTGGT GACAACCCCC TGGAGCTGAG 1140 

TAGGATTOCT GATGAAAACT GCCAGATTAA CAGATATGGC CawnTTCAAG CXrACCATCAC 1200 

AATTGTAGAG GGAATCTTAG AGGTTAACAT CATCCAGATG ACAGACGTCC TGATGOOGGT 1260 

GCCATGGCCT GAAAOCTCCC TAATAGACTT TGTOGT G ACC T8CCAAGG6A GCATTOCCAC 1320 

GGAOGTCTGT ACXATCATTT CTGACCCC3VC CTGCGAGATC AOCCAGAACA CAGTCTGCAO 1380 

CCCTGTGGAT GTGCaiTGAGA TGTGTCTGCT GACTGTGAGA OGAACCTTCA ATGGGTCTGG 1440 

GAOGTACTGT GT6AACCTCA CCCTGGGGGA TGACACAAGC CTGGCTCTCA CGAGCACCCT 1500

GATTTCTGTT CCTGACAGAG ACCCAGCXrTC GCCTTTAAGG ATG6CAAACA GTGCCCTGAT 1560 

CTCOGTTGGC TGCTTGOCCA TATTTGTCAC TGTGATCTCC CTCTrGGTGT ACAAAAAACA 1620 

CAAGGAATAC AACCCAATAG AAAATAGTCC TGGGAATGTG GTCAGAAOCA AAGGCCTGAG 1680 

TGTCTTTCTC AACX»TGCAA AAGCCGTGTT CTTCXXX3GGA AACCAGGAAA AGGATCCGCT 1740 

ACTCAAAAAC CAAGAATTTA AAGGAGTTTC TTAAATTTCG ACCTTGTTTC TGAAGCTCAC 1800 

TTTTCAGTGC CAITGATGTG AGATGTGCTO GAGTGGCTAT TAACXTTTTTT TTCCTAAAGA 1860 

TTATTGTTAA ATAGATATTG TGGTTTGGGG AAGTTGAATT TTTTATAGGT TAAATGTCAT 1920 

TTTAGAGATG GGGA6AGGGA TTATACTGCA GGCAGCTTCA GCCATGTTGT GAAACT6ATA 1980 

AAAGCAACIT A6CAAG6CTT CTTTTCATTA TTTTTTAfCT TTCACTTATA AAGTCTTAGG 2040

TAACTAOTAG GATA6AAACA CTGTGTCCOC AGAGTAAGGA GAGAAGCTAC TATTGATTAG 2100 

AGCCTAACCC AGGTTAACTG CAAGAAGAGG CGGGATACTT TCAGCTTTOC ATGTAACTGT 2160 

ATGCATAAAG CCAATGTAGT CCAGTTTCTA AGATCATGTT CCAAGCTAAC TGAATCCCAC 2220 

TTCAATAC31C ACTCATGAAC TCCTGATGGA ACAATAACAG GCCCAAGCCT GTGGTATGAT 2280 

QTGCACACTT GCTAQACTCA GAAAAAATAC TACTCTCATA AAT66GT6GG AGTATTTTGG 2340 

TOACAACCTA CTTTGCTTOO CT6AGTGAAG GAAT6ATATT CATATATTCA TTTATTCCAT 2400 

GGACATTTAG TTAGTGCTTT TTATATACX» GGCATGATGC TGA6TGACAC TCTTGTGTAT 2460 

ATTTCCAAAT TTTTGTATAG TOGCTCCACA TATTTGAAAT CATATATTAA GACTTTCCAA 2520 

AGATGAGGTC OCTGGTTTTT CATGGCAACX TGATCAGTAA GGATTTCACC TCTGTTTGTA 2580 

ACTAAAACCA TCTACTATAT GTTAGACATG ACATTCTTTT TCTCTCCTTC CTGAAAAATA 2640 
AAQTQTO Q GA ASAGACAAAA AAAAAAAAA 

Seq ID NO: 383 Protein sequence
Protein Accession #: NP_002501

1 11 21 31 41 51

| | | | | |

MECLYYPLGP LLLAARLPU) AAKRFHDVLG NERPSAYMRB HNQLNGWSSD ENDWNEKLYP 60 

VWKRQ3MRWK NSHK66RVQA VLTSDSPALV GSt7ZTFAVNX< IFPRCQKEOA KGHIVYEKKC 120

RMEAGLSADP YVYHmAWSS DSDGEMGTGQ SBBNVFPDGR PPPBHP6HRR NNFIYVPHTL 180 

GQYFQKLGRC SVRVSVNTAN VTLGPQLMBV TVYRRHGRAY VPIAQVOTVY WTDQIPVFV 240 

TMPQKNDRNS SDBTFLKDLP IMFDVLIHDP SHFLNYSTIN YKWSPGDNTG LFVSTNHTVN 300 

HTYVUJGTPS UrtjTVKAAAP GPCPPPPPPP RPSKPTPSLG PAGDNPLELS RIPDENCQIK 360 

RYGHFQATIT IVEGILEVNI IQMTDVIUPV PWPESSLIDF WTCQGSIPT BVCTIISDPT 420 

CSZTQNTVCS FVSVDEMCLL TVRSTFNGSG TyCVNLTLGD DTSZALTSTZi Z5VPZ3RDPAS 480 

PIiSHANSALI SVOOAZFVT VISLLVYRXH KEYMPXaiSP <33VVRSIpBLS VFUIRAXAVP 540 
FPGNQBKDPL I/KNQEFRGV8 

Seq ID NO: 384 DNA sequence
Nucleic Acid Accession #: NM_001134
Coding sequence: 48-1877

1 11 21 31 41 51

| | | | | |

TCCATATTCT GCTTCCACCA CTGOCAATAA CAAAATAACT A6CAA0CATG AA6TGGGTG0 60 

AATCAATTTT TTTAATT TT C CTACTAAATT TTACT6AATC CAGAACACTG C ATAG AAATO 120 

AATATGGAAT AGCTTCCATA TTGGATTCTT ACCAATGTAC TGCAGAGATA AGTTTAOCTO 180 

ACCTGGCTAC CATATTTTTT GCCCAGTTTG TTCAAGAAGC CACTTACAAG GAAGTAAGCA 240 

AAATG6TGAA AGAT6CATT6 ACTGCAATTG AGAAACCCAC TGGA0AT6AA CA6TCTTCAG 300 

GGTOTTTAGA AAACCASCTA C Xri XA CC i 'TT C TGOAAGAACT TTGGCATSAG AAAGAAATTT 360 

T6GAGAAGTA G6GACATTCA GACTOCIGCA G0CAAAGT6A A6AGGGAAGA CATAACTGTT 420 

TTCTTGCACA CAAAAAGCCC ACTCXAGCAT OGATCCCACT TTTCCAAGTT CCAGAAOCTO 480 

TCACAAOCro TGAA6CATAT GAAGAAGACA GGGAGACATT CATGAAC3U\A TTCATTTATO 540 

AGATA6CAAG AAG6CATCCC TTCCTGTAT6 CACCTACAAT TCTTCTTT66 GCTGCTC6CT 600 

AT6ACAAAAT AATTCCATCT TGCTGCAAAG CTGAAAATGC AGTT6AATGC TTCCAAACAA 660 

AG6CAGCAAC A6TTACAAAA GAATTAA6AG AAAOCAGCTT GTTAAATCAA CATGCATGTG 720 

CAGTAATQAA AAATTTTOGG ACCOOAACTT TCCAAGCCAT AACTGTTACT AAACTGAGTC 780 

AGAAGTTTAC CAAAGTTAAT TTTACTGAAA TCCAGAAACT AGTCCTGGAT GTGGCCCATG 840 

TACATGAGCA CTGTTGCAGA GGAGATGT6C TGQATTGTCr OCAGGATGGG GAAAAAATCA 900 

TGTCCTACAT ATGTTCTCAA CAAGACACTC TGTCAAACAA AATAACAGAA TGCTGCAAAC 960 

TGACCAOGCT GGAACGTGGT CSU^TGTAXAA TTCATGCA6A AAATGAT6AA AAAC CTGAAG 1020 

GTCTATCTCC AAATCTAAAC AGOTTTTTAG GAGATAGAGA TTTTAACCAA TTTTCTTCAG 1080 

GGGAAAAAAA TATCTTCTTG OCAAGTTTTG TTCATGAATA TTCAAGAAQA CATCCTCAGC 1140 

TTGCTOTCTC AGTAATTCTA AGAGTTGCTA AAGGATACCA GGAGTTAT7G GAGAAGTGTT 1200 

TCCAGACTGA AAACCCTCTT GAATGCCAA6 ATAAAGGA6A AGAAOAATTA CA6AAATACA 1260 

TCCAGGAGAG CCAAGCATTG GCAAA60QAA GCTG0G6CCT CTTCCAGAAA CTAGGAGAAT 1320 

ATTACTTACA AAATGOGTTT CTCOTTGCTT ACACAAA6AA AGCCCOCCAG CTGACCTOGT 1380 

OGGAGCTQAT GGCCATCACC AGAAAAATGG CAGCCAOVGC AGCCACTTGT TGCCAACTCA 1440 

GTGA6GACAA ACTATTGQCC T6TG60BAGG GA60GGCTGA CATTATTATC GGACACTTAT 1500 
GTATCAGACA TGAAA3GACT OCAGTAAACC ClGGTGTTq S O CA gTGCTGC ACTTCTTCAT 1560

ATGCCAACAO GAGGCCATGC TTCAGCAGCT TGGTGGTGGA TGAAACATAT GTCCCTCCTG 1620 

CATTCTCTGA TQACAAGTTC ATTTTCCATA AGGATCTGTG CCAAGCTCAG GGTGTAGCGC 1680 

TGCAAAOGAT GAAGCAAGAG TTTCTCATTA ACCTTGTGAA GCAAAAGCCA CAAATAACAG 1740 

AGGAACAACT TGAGGCTGTC ATTGCAGATT TCTCAGGCCT 6TTGGAGAAA TGCTGCX:aAG 1800 

GCCAGGAACA GGAAGTCT6C TTTGCTGAAG AGGGACAAAA ACT GA ITTC A AAAACT06TG 1860 

Clli C 'lT'iX i GG AGTTTAAATT ACTTCAGGGG AA6AGAAGAC AAAAOGAGTC TTTCATTCG6 1920 

TGTBAACrTT TCTCPPTAAT TTTAACTGAT TTAACACTTT TTGTGAATTA ATGAAATGAT 1980 
AAAGACTTTT ATGT6AQATT TOCTTATCAC A6AAATAAAA TATCTCCAAA TG 

Seq ID NO: 385 Protein sequence
Protein Accession #: NP_001125

1 11 21 31 41 51

| | | | | |

KKWVESIFLI FLLHFTESRT IiBRNBYGIAS XLOSYQCTAB ISLADIATIF FAQPVQEATY 60 

REVSKHVKDA LTAIEKPT(3>- EQSSGCLENQ LFAFLESLCE EKBILEECVO! SDCCSQSEEG 120 

RHNCFLAHKK PTPASIPLFQ VPEPVTSCBA YEEDRETFMS KPIYEIARRH PPLYAPTILL 180 

HAARYDKIIP SOCKAENAVB CPQTKAATVT KELRESSLLM QHACAVMKNP GTRTPQAITV 240 

TKLSQKPTKV NPTEIQKLVL DVAHVHEHCC RGDVLDCLQD GEKIMSYICS QQDTLSNKIT 300 

ECCKLTTLER GQCIIHAEND BKPBtJLSPNL NRFIiGDIlDFN QFSSGEKNIP LASFVHEYSR 360 

RHPQLAVSVI LRVAKGYQEL LEKCPQTENP LECQDKGEEE LQKYIQESOA LAKRSOGLPQ 420 

KLGEYYIiOMA FLVAYTKKAP QLTSSELMAI TRKKAATAAT COQLSEDXLL ACGBGAADIZ 480

XGRLCZRHEM TPVNPGVGQC CTSSYANRRP CFSSLWDET YVPPAFSZZDK FIFHRDLCQA 540 

QGVALQTMRQ EPLINLVKQK PQZTEEQIiEA VIAOFSGLLB ROOQGQSQEV CFAEBGQKLI 600 
SKTRAAXiQV 

Seq ID NO: 386 DNA sequence

Nucleic Acid Accession #: NM_002205.1

Coding sequence: 1..3149

1 11 21 31 41 51

| | | | | |

AT6GGGAGCC GGAOGCCAGA GTCCCCTCTC CAaSCOOTGC AGCT60GCTG GGGCCXCOGG 60 

C3GCCGACCCC CGCTSSTGCC GCTGCTGTTC CTGCTSSTGC OGCOGCCACC CAGGC5T0GGG 120 

GGCTTCAACT TAGAC3G0GGA GGCCCX»GCA GTACTCTCGG GGCCCOOGGG CTCCTTCTTC 180 

6GATTCTCA6' TGGA6TT7TA CXX3GCCGGGA ACAGAG6GGG TCAGTGT6CT GGTQGGAGCA 240 

COOVAGGCTA ATACCAGCCA G0CA66AGTG CTGCAGGGTG QTGCTGT CT A CCTCTOTCCT 300 

TGGGSTOCCA GCCCCACACA GTGCACOCOC ATT6AATTT0 ACA6CAAAGG CTCT066CTC 360 

CTGGAGTCCr CACTGTCCAG CTCAGAGGGA GAGGAGCCTG TGGAGTACAA GTCCTTGCAG 420 

TGGTTOGGGG CAACAGTTCX3 AGCCCATGC5C TCCTCCATCT TGGCATGOGC TCCACTGTAC 480 

AGCTGGOGCA CAGAGAAGGA GCCACTGA6C GACCCCGTGG GCACCTGCTA CCTCTGCACA 540 

QATAACTTCA C0C6AATTCT OGAGTATGCA OOCTGCOQCT GAQATTTCAO CTQGOCAGCA 600 

GGACAGGGTT ACTGCCAAGG AGGCTTCAGT GOCGAGTTCA OCAAGACTGG OCOTGTGOTT 660 

TTAGGTGGAC CAGGAAGCTA TTTCTGGCAA GGCCAQATCC TGTCT6CCAC TCAGGAGCA6 720 

ATTGCAGAAT CTTATTACCC OOAGTACCTG ATCAACCTGG TTCAGGGGCA GCTGCAGACT 780 

CGCX3«36CCA GTTCCATCTA T6ATGACAGC TACCTAGGAT ACTCTGTGGC TGTTGGTGAA 840 

TTCAOTOGrG ATGftCACAGA AGACTTTGTT GCTGGTGTQC 0CAAAG66AA GCTCACTTAC 900 

G6CTATGTCA CCATOCTTAA T66CTCAGAC ATTC6ATCCC TCTACAACTT CTCAOGGGAA 960 

CAGATGGCCT CCTACTTTGG CTATGCAGTG GOOSCCACAG AOGTCAATCG OGACGOGCTG 1020 

GATGACTTGC TGGTGGGGGC ACCCCXXSCTC ATQGATCX36A CCCCTGACGG GCGGCCTCAG 1080 

GAGGTGGGCA GGGTCTACGT CTACCTGCAG CACCCA6C06 GCATAGAGCC CACX3CCCACC 1140 

CTTAOOCTGA CTOOOCATGA TGAGTTIGGC GGATTTQ6C3V GCTCCTTGAC CCCOCTGGGG 1200 

QACCT66ACC AG6AT66CTA CAATGAT6TG GCCATG66GG CfCCC m x aG TQGGGAGACC 1260 

CAGCAGGGAG TAGTGTTTGT ATTTCCTGGG GGCCCA6GAG GOCTGGGCTC TAAGCCTTCC 1320 

CAGGTTCTGC AGCCCCTGTG GGCAGCCAGC CACACCCCAG ACTTCTTTQG CTCTGCCCTT 1380 

C6AGGAG6CC GAGACCTGGA T6GCAATGGA TATCCTGATC TGATTGTGGG GTCCTTTGGT 1440 

GXGQACftAGQ CTGTOOrATA CAGGGGCOGC OCC A TCGTGT OOGCTAGTGC CTCGCTCAOC 1500 

ATCTTOCCOQ CCATGTTCAA CCCAGAGGAG OGGAGCTGCA GCTTAGAGG6 GAACCXTTSTG 1560 

GCCTQCATCA ACCTTAGCTT CTGCCTCAAT GCTTCTGGAA AACAOGTTGC TGACTCCATT 1620 

GGTTTCACAG TGGAACTTCA GCTGGACTGG CA(SAAGCAQA AGOQAGGGGT ACXKXX3G6CA 1680 

CTGTTCCTGG CCTCCAGGCA GOCAACCCTG ACCCAGACCXT TGCTCATCXA GAATGGGGCT 1740 

C6AGAGGATT GCA6AGAGAT OAAGATCTAC CTCAGGAA08 AGTCAGMVTT T0GA6ACAAA 1800 

CTCTC6C0GA TTCACATC6C TCTCAACTTC TOCTTGGACC COCAAQCCCC ASTGGACAOC 1860 

CACXWCCTCA GGCCAGCCCT ACATTATCAG AGCAAGAGCXI OGATAQAGGA CAAOGCTCAG 1920 

ATCTTGCTGG ACTGTGGAGA AQACAACATC TGTGTGCCTG ACCTQCAGCT GGAAGTGTTT 1980 

6GGGAGCA6A ACCAT6TGTA OCTGGGTGAC AAGAATGCXX TGAACCTCAC TTTCCATGCC 2040 

CA6AATGTGG GTGAG3GTG6 OXCXATGAG GCTGAGCTTC 0GGTCAC06C COCTOCAGAG 2100 

GCTGAGTACT CAG6ACT0GT CA6ACACCCA GGGAACTTCT CCA6CCTGA6 CTGTGACTAC 2160 

TTT0CXX3TGA ACCAGAGCC3C CCTCCPGGTS TCTGAOCPGG 6CAACXXCAT GAA06CAGGA 2220 

GCCAGTCTGT GGGGTGGCCT TOGGTTTACA GTCCCTCATC TCOGGGACAC TAAQAAAACC 2280 

ATCCAGTTTG ACTTCOWSAT CCTCAGCAAG AATCTCAACA ACTOGCAAAQ CGAOGTGGTT 2340 

TCCTTTCG6C TCTCOGTGGA GQCTCAGGGC CAGQTCAOOC TGAA06GTGT CTCCAAGCCT 2400 

GAGGCAGT6C TATTCCCAGT AA0CX3ACT0G CATCCCC96AG ACCA6CCTCA 6AAGGAGGAG 2460 

GACCTGGGAC CTGCTGTCCA CCATGTCTAT GAGCTCATCA ACCAAGGCCC CAGCTCCATT 2520 

AGCCAGGGTG TGCTGGAACT CAGCTGTCCC CAGGCTCTGG AAGGTCAGCA GCTCCTATAT 2580 

GTGACCAGA8 TTAOGGGACT CAACTQCACX: ACCAATCAOC GCATTAACCC AAAGGGCCTG 2640 

GAGIIGGATC OGGAGGGTTC CCT6CA0CAC CAGCAAAAAC GGGAA6CT0C AAG006CAGC 2700 

I VrGC r X'CCr GGGGACCTCA GATCCTGAAA TGCOCGGAGG CTGAGTGTTT CAGGCT G CGC 2760 

TGTGAGCTC6 GGCCCCTGCA CCAACAAGAG AGCC3UUU3TC TGCASTTGCA TTTCCGAGTC 2820 

TGQGCCAAGA CTTTCTTGCA GCGGGAGCAC CAGCCArrTA GCCTGCAGTG TOAGGCTGTQ 2880 

TACAAA6CCC TGAAGATGCC CTACCGAATC CTGCCTOGGC AGCTGCCCCA AAAAGAGCGT 2940 

CAGGTGGCC3V CAGCTGTGCA ATGGACCAAG GCAGAAOGCA GCTATGGCGT CCCACTGTG6 3000 

ATCATCATCC TAGCCATCCT GTTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACATCCTC 3060 

TAGAAGCTTG GATTCTTCAA AOGCTCOCTC CCAXATGGCA CCG0CATG6A AAAAGCTCAG 3120 
CICAAQOCTC C31QCC!ACCTC TGATGCCIGA 



Seq ID NO: 387 Protein sequence
Protein Accession #: NP_002196.1

1 11 21 31 41 51

| | | | | |

MGSRTPESPL RAVQLRHGPR RRPPIiLPLLL LLLPPPPRVG GFHLDABftPA VLS0PP6SFF 60 

GPSVEPYRPG TDGVSVLVGA PKANTSQPGV LCX5GAVYLCP WGASPTQCTP lEFDSKGSRL 120 

LESSLSSSBG EEPVEYKSLQ WFC5ATVRAHG SSILACAPLY SWRTEKEPIiS DPVGTOfLST 180 

DNFTRILEYA PCRSDPSKAA GQGYOQGGPS ABFTKTGRW LGGPGSYFWQ GQILSATQEQ 240

lABSyyPBVL 'ZNLVQGQLQT RQASSXym)S YLGYSVAVGE FSGDDTBDFV AGVPKCaiLTY 300 

GYVniiNGSD IB6I1YKF8GB QMASYFGYAV AAtDVNGDGL DDI.LVGAPI1L KDRTPDGRPQ 360 

CVGRVYVYLQ BPAGIEPTPT LTLTGHDEFG RFGSSLTPLG DLDQDGYNDV AIGAPFGGET 420 

QQGWPVFPG GPGGLGSKPS QVLQPLHAAS HTTOFFGSAL RGGRDLDQJG YPDLIVGSPG 480 

VDKAWYRGR PIVSASASLT IPPAMFNPEE RSCSI1B6NPV ACZHLSFCLH ASGKBVADSI 540

6FTVELQLDH QKQKGGVRItA LPIASRQATL 1QTLLIQKGA SEDCREHKIY LRNESEFRDK 600 

LSPIHIAIiNF SLDPQAPVDS RGLRPALHYQ SKSRIEDKAQ ZLLDOGEDHI CVFDLQLEVP. 660 

GBQKHVYLGD KNALNLTPHA QNVGEGGAYB AELRV7APPE AEYSGLVRHP Gt^^SSLSCDY 720 

FAVNQSRIiV CDLCaiPMKAG ASLWGGLRFT VPHLRDTKKT IQFDFQILSK NLNNSQSCW 780 

SFRLSVEAQA QVTLNGVSKP EAVLFFVSDW HFROQPQKEE DLGPAVRHVY ELINOGPSSI 840

SQGVLBLSCP QALEGQQLLY VTRVTGI^TCT TNHPINPKGL ELDPEGSLHH CXJKREAPSRS 900 

SASSGPQZIiK CPEAECFRLR CBLGPLHQQE 8QSLQLHFRV HAKTFLQREH QPFSLQCEAV 960 

yKALKKPYRZ LPRQLPQKBR OVATAVQHTR AB6SYGVPLH ZIIIAZLFGL LLLGLiZYlL 1020 
YKLGPPKRSL PYGTAMEKAQ LKPPATSDA 






Seq ID NO: 388 DNA sequence

Nucleic Acid Accession #: NM_002425

Coding sequence: 26..1453

1 11 21 31 41 51

| | | | | |

AAAGAAGGTA AGGGCAOTGA GAATGATGCA TCTTGCATTC CTTGTGCTGT TGTGTCTGCC 60 

AGTCTGCTCT 6CCTATCCTC TGAGTGGG6C AGGAAAAGAG GAGGACTCCA ACAAGGATCT 120 

T6CCaU3CAA TACCTAGAAA AGTACTACAA CCTOGAAAAG GATGTGAAAC AGTTTAGAAG 180 

AAAGGACAGT AATCTCATTG TTAAAAAAAT CXaAGQAATG CAGAAGTTCC TTGGGTTG6A 240

GGTGACAGGG AAGCTAGACA CTGACACTCT GGAGGTGATG CGGAAGCCXA GGTGTGGAGT 300 

TCCTGAOSTT GGTCACTTCA GCTCCTTTCC TGGCATGCOS AAGTGGAGGA AAACCX31CCT 360 

TACATACAG6 ATTGTGAATT ATACACCA6A TTTGCCMGA GATGCTGTTB ATTCTGCCAT 420 

TGAGAAA6CT CTGAAAGTCT GGGAAGAGGT GACTCCACTC ACATTCTCCA G6CTGTATGA 480 

AGGAGAGGCT GATATAATGA TCTCTTTC6C AGTTAAAGAA CATGGA6ACT TTTACTCTTT 540

TGATGGCCCA GGACACAGTT TGGCTCATGC CTACCCACCT GGACCTGGGC TTTATGGAGA 600 

TATTCACTTT GATGATGAT6 AAAAATGGAC AGAAGATGCA TCAGGCACXA ATTTATTCCT 660 

O G TTQCT G CT CATGAACTT6 GCCACTCCCT G6GGCTCTTT CACTCAGOCA ACACTGAAQC 720 

TTTGATGTAC OCACTCTACA ACTCATTCAC AGAGCTOGCC CAGTTCCGCC TTTOOaWfiA 780 

TGATGTGAAT GGCATTCAGT CTCTCTACX^G ACCTCCCCCT 6CCTCTACTG AGGAACCCOT 840

GGTGC(XACA AAATCTGTTC CTTCGGGATC TGAGATGCCA GCCAAGTGTQ ATCCTGCTTT 900 

GTCCrrcGAT GCCATCAGCA CTCTGAGGGG AQAATATCTG TTCTTTAAAG ACAGATATTT 960 

TTGGCQAAGA TCCCACTOGA ACCCTGAACC TGAATTTCA7 TTGATTTCTQ CA TTTTGO OC 1020 

CTCTCTTCCA TCATATTT66 ATOCTGCATA TGAAGTTAAC AOCAGGGACA COGTTTTTAT 1080 

TTTTAAAGGA AATGAGTTCT GGGCCATCAG AGGAAAT6AG GTACAAGCAG GTTATCCAAG 1140

AGGCATCCAT ACCCTGGGTT TTCCTCCAAC CATAAGGAAA ATTGATGCAO CTGTTTCTGA 1200 

CAAGGAAAAO AAGAAAACAT ACTTCTTT6C AGOQGACAAA TACTGGAGAT TTGAT6AAAA 1260 

TAGCGAGTCC ATGGAGCAAS GCTTCCCTAO ACTAATAGCT GATGACTTTC CAGGAGTTGA 1320 

GOCTAAGGTT 6ATGCTGTAT TACAG6CMT TGGA' mi TC TACTTCTTCA GTGGATCATC 1380 

ACAGTTTGAG TTTGACCCCA ATGCCAGGAT GGT6ACACAC ATATTAAAGA GTAACAGCTG 1440

GTTACATTGC TAGGCGAGAT AGGGGGAAGA CAGATATGGO TGTTTTTAAT AAATCTAATA 1500 

ATTATTCATC TAATGTATTA TGAGCCAAAA TOGTTAATTT TTO CTGCA TG TTCTGTX5ACT 1560 

GAAGAAGATQ AGOCTTGCAG ATATCT6CAT GTGTCATGAA GAATGTTTCT GGAATTCTTC 1620 

ACTTGCTTTT QAATTGCACT GAACAGAATT AA6AAATACT CATGT6CAAT AGGTGAGAGA 1680

ATGTATTTTC ATAGATGTGT TATTACTTCC TCAAXAAAAA GTTTTATTTT G060CTGTTC 1740
CTT 



Seq ID NO: 389 Protein sequence
Protein Accession #: NP_002416

1 11 21 31 41 51

| | | | | |

MHLAFLVLLC LPVCSAYPLS GAAKBBDSNK DZiAQOYLEKy YNLSXDVKQF BRXDSHIilVK 60 

KZQGKQKPIiG LBVTGKLDTO TLBVMRKPRC CVPOVGHPSS FPQ1PXWR1CT HLTVRZVHYT 120 

PDLPSDAVDS AZEKALKVWB EVTPLTPSRL YBGEADIMIS FAVKEHGDFY SFDGPGHSLA 180

HAYPPGPGLY GDIHFDDDBK WTE33ASGTML PLVAAHELGH SLGLFHSANT EALMYPLYNS 240 

FTELAQPRLS QDDVNGZQSL YGPPPASTEB PLVFTKSVPS GSEHPAKCSP ALSPDAISTL 300 

RGEYLFFKDR YFWRRSHWNP EPEFHLZ8AF WPSLPSYLDA AXBVNSHOTV FZPKGKEFHA 360 

IRGNEVQAGY PRGZHTLGFP PTZRKIDAAV SOKEKKKTYF FAADIOmRFD ENSQSMEQGF 420 

PRLIADOFPG VEPKVDAVLQ AFGFPYPPSO SSQFEFDFKA RHVTBXLKSH SHLHC



Seq ID NO: 390 DNA sequence

Nucleic Acid Accession #: NM_002421.2

Coding sequence: 1..1409

1 11 21 31 41 51

| | | | | |

ATGCACAGCT TTCCTCCACT GCTGCTGCTG CTGTTCTGOG GTqTGGTGTC ACAGAGCTTC 60
CCAGOGACTC TAGAAACACA AGAGCAAGAT GTGGACTTAG TCCAGAAATA CCTGGAAAAA 120
TACTACAACC TGAAGAATGA TGGGAGGCAA GTTGAAAAGC GGAGAAATA6 TGGCCCAGTG 180
GTTGAAAAAT TGAAGCAAAT 6CAGGAATTC TTTGGGCTGA AAGTGACTGG GAAACCAGAT 240
GCTGAAACCC T6AAGGT6AT GAAGCAGCCC AGAT6TGQA0 TGCCTGATGT G6CTCAGTTT 300
(STCCTCACTG AGGGGAAOOC TGGCTGGGAG CAAACACATC TGACCTACA8 GATT6AAAA7 360
TACSiCGOCAO ATTTCCCAAO AGCAGATGTO GACCATGOCA TTGAGAAA6C CTTGCAACTC 420
TGGAGTAATG TCACACCTCT GACATTCACC AAGGTCTCTG AGGGTCAAGC AGACATCATG 480
ATATCTTTTG TCAGGGGAGA TCATOGGGAC AACTCTCCTT TTGATGGACC TGGAGGAAAT 540
CTTGCTCATG CTTTTCAAOC AGGCOCAGGT ATTGGAGGGG ATGCTCATTT TGATGAAGAT 600
GAAAGGIGGA CCAACAATTT CA6AGAGTAC AACTTACATC QT0TTG06GC TCATGAACTC 660
GGCCATTCTC TTGGACTCTC CCATTCTACT GATATCX3QG0 CTTTGATGTA UUUlAGCTAC 720
ACCTTCAGTG GTGATGTTCA GCTAGCTCAG GATGACATTG AT6GCATCCA AGCCATATAT 780
GGACGTTCCC AAAATCCTGT CCAGCCCATC GGCCCAOUUl CCCCAAAAGC AT6TGACAGT 840
AAGCXAACCT TTGATGCTAT AACTAOGATT CGGGGAGAAG T6ATGTTCTT TAAAGACAGA 900
TTCXACAIGC GCACAAATGC CTTCTACCOQ GAAGTTGAGC TCAATTTCAT •ITLWmi'C 960
TGGOCACAAC TGCCAAATG6 GCTT6AA6CT GCTTAOGAAT TTGC0GACA6 AGATGAAGTC 1020
OGGTTTTTCA AAGGGAATAA GTACTGGGCT GTTCAGGGAC AGAAkTGTGCT ACACGGATAC 1080
CXXAAGGACA TCTACAGCTC CTTTGGCTTC CCTAGAACTG T6AAGCATAT CGATGCTGCT 1140
CTTTCTGAGG AAAACACTGG AAAAACXrTAC TTCTTT6TTG CTAACAAATA CTGGAGGTAT 1200
GATGAATATA AACX3ATCTAT GGATCCAGGT TA7CCCAAAA TGATAGCACA TGACTTTCCT 1260
GGAATTG6CC ACAAA6TTGA TGCAGTTTTC ATGAAAGATG GATTTTTCTA TTTCTTTCAT 1320
GGAACAA6AC AATAC3UUITT T6ATCCIAAA A06AAGAGAA TTTTGACTCT CCA6AAAGCT 1380
AATA6CTGGT TCAACTGCAG GAAAAATTAG

Seq ID NO: 391 Protein sequence
Protein Accession #: NP_002412.1

1 11 21 31 41 51

| | | | | |

MHSFPPLLLIi LFW6WSH8F PATLBTQEQD VDLVQKnjEK VYNLKinXsEtQ VEKRBNSGFV 60
VEKLKQMQEF FGLKVTGKPD AETLKVMKQP ROGVPDVAQF VLTEGNPRWB QTHLTyRZEN 120
YTPDLPRADV DHAIEKAFQL WSNVTPLTFT KVSEGQADIM ISFVRGDHRD NSPFDGPGGN 180
LAHAFQPGPG IG6DAHFDED ERWTNNFREY NLHRVAAHEL GHSLGLSHST DIGALMYPSY 240
TFS(3}VQLAQ DDIDGIQAIY GRSQMPVQPI GPQTPKACDS KLTFDAITTI RGEVKFFXDR 300
FYMRTNPFYP EVBLNFISVF WPQLPNGLEA AYBFADRDEV RFFXGKKYWA VQGQMVLH6Y 360
PKDIYSSFGF PRTVKHZDAA LSEEafTGKTY FFVANKYHRY DEYKRSMDPG YPKHIAHDFP 420
GIGBKVDAVF MKDGFFYFFH GTRQYKFDPK TKRILTLQKA NSWFKCRKN



Seq ID NO: 392 DNA sequence

Nucleic Acid Accession #: NM_002421.2

Coding sequence: 1..1409

1 11 21 31 41 51

| | | | | |

ATGCACAGCT TTCCTCCACT GCTGCTGCTG CTGTTCTGGG GTGTGGTGTC ACACAGCTTC 60
CCA6CGACTC TAGAAACACA AGAGCAAGAT GTGGACTTAG TCCAGAAATA CCTGGAAAAA 120
TACTACAACC TGAAGAATGA TGGGAG6CAA GTTGAAAAGC G6AGAAATAG TGGCCCAGTG 180
GTTGAAAAAT TGAAGCAAAT GCA06AATTC TTTG66CTGA AAGT6ACTGG GAAACCAGAT 240
GCTGAAACCC TGAAGGTGAT GAAGCAGCCC AGATGTGGAG TGCCTGATGT GGCTCAGTTT 300
GTCCTCACTG AGGGGAACCC TOGCTGGGAO CAAACACATC TGACCTACAG GATTGAAAAT 360
TACACGCCAG ATTT6CCAAG AGCAGATGTG GACCATGGCA TTGAGAAAGC CTTCCAACtc 420
TGGAGTAATG TCACACCTCT GACATTCACC AAGGTCTCTG AGGGTCAAGC AGACATCATG 480
ATATCTTTTG TCAGGGGAGA TCATOGGGAC AACTCTCCTT TTGATGGACC TGGAGGAAAT 540
CTTGCTCATG CTTTTCAACC AGGCCCAGGT ATTGGAGGGG ATGCTCATTT TGATGAAGAT 600
GAAAGGTGGA CCAACAATTT CAGAGAGTAC AACTTACATC GTGTTGCGGC TCATGCCCTC 660
GGCCATTCTC TTGGACTCTC CCATTCTACT GATATOGGGG CTTTGATGTA CCCTAGCTAC 720
ACCTTCAGTG OTQATOTTCA GCTAGCTCAG GATGACATTG ATG6CATCCA AGCCATATAT 780
GGACGTTCCC AAAATCCTGT CCAGCCCATC GGCCCACAAA CCCCAAAAGC ATGTGACAGT 840
AAGCTAACCT TTGATGCTAT AACTACGATT OGGGGAQAAG TGATGTTCTT TAAAGACAGA 900
TTCTACATGC GCACAAATGC CTTCTACCOQ GAAGTTGAGC TCAATTTCAT TTCTGTTTTC 960
TGGCCACAAC TGCCAAATGG GCTTGAAQCT GCTTAOGAAT TTGCCGACA6 AGATGAAGTC 1020
CGGTTTTTCA AAGGGAATAA GTACTGGGCT GTTCAGGGAC AGAATGTGCT ACACGGATAC 1080
CCCAAGGACA TCTACAGCTC CTTTGGCTTC CCTAGAACTG T6AAGCATAT CGATGCTGCT 1140
CTTTCTGAGG AAAACACTGG AAAAACCTAC TTCTTTGTTG CTAACAAATA CTGGAGGTAT 1200
GATGAATATA AACGATCTAT GGATCCAGGT TATCCCAAAA TGATAGCACA TGACTTTCCT 1260
GGAATTGGCC ACAAAGTTGA TGCAGTTTTC ATGAAAGATG GATTTTTCTA TTTCTTTCAT 1320
G6AACAAGAC AATACAAATT TGATCCTAAA A06AA6AGAA TTTTGACTCT CCAGAAAOCT 1380
AATA0CT6GT TCAACTGCAG GAAAAATTAG

Seq ID NO: 393 Protein sequence
Protein Accession #: NP_002412.1

1 11 21 31 41 51

| | | | | |

MHSPPPIJiLL LFHGWSHSF PATLSTQSQD VDLVQKYLEK YYNLKMDGRQ VEKRRNSGPV 60
VEKLXQMQSF FGLKVTGKPD AETLKVPfKQP ROGVPDVAQF VLTEGNPRWS QTBLTYRIEN 120
YTPDLPRADV DHAIEKAFQL WSNVTPLTFT KVSBGQADIM ISFVROSRRD KSPFDGPGOf 180
LAHAFQPGPG IGGDAHFDED ERHTNNFRBY NLHRVAAHAL GHSLGLSHST DIGALMYPSY 240
TPSGDVQLAQ DDIDGIQAIY GRSQNPVQPI GPQTPKACDS KLTFDAITTI RGEVMPFKDR 300
FYMRIKPFYP EVELNFISVF HPQLPHGLEA AYEFADRDEV RFFKQNKYWA VQGQNVLHGY 360
PKDIYSSFGF PRTVKHIDAA LSEQITGKTY FFVANKYHRY DEYKRSMDPG YPRKIAHDFP 420
GIGHKVDAVF MKDGFFYFFH GTRQYBCFDPK TKRILTLQKA MSWFNCRKN

Seq ID NO: 394 DNA sequence

Nucleic Acid Accession #: NM_014331.2

Coding sequence: 1..1506

1 11 21 31 41 51

| | | | | |

ATGGTCAGAA A6CCT6TTGT GTCCACCWTC TCCAAAGQKB GTTACCT6CA GGGAAATGTT 60

AAOSG6AGGC T6CCTTCCCT GGGCAACAAG GftGCCACCIG C3GCA6GAGAA A6TGCA6CTG 120 

AAGAGGAAAG TCACTTTACT GAGGGGAGTC TCCATTATCA TTGGCACCAT CATTGGAGCA 180

GGAATCTTCA TCTCTCCTAA GGGCGTQCTC CAGAACACGG GCAGC3GTGGQ OVTGTCTCTG 240 

ACCATCTGGA CGGTGTGT6G GGTCCTGTCA CTATTTGGAG CTTT6TCTTA TGCTGAATTG 300 

GGAACAACTA TAAAGAAATC TGGAGGTGfVT TACACATATA TTTIGGAAGT Cm X ^GTOCa 360 

TTAOCAGCTT TTGTAOGAGT CTGGGTGGAA CTCCTC A TAA TAOGCCCTGC AGCTACTGCT 420 

GTGATATCCC TGGCATTTGG ACGCTACATT CTGGAACCAT TTTTTATTCA ATGTGAAATC 480 

CCTGAACTTG CXSATCAAGCT CATTACAGCT GTGGGCATAA CTGTAGTGAT GGTCCTAAAT 540

AOCATGAGTO TCAGCTQGA6 06CC0GGATC CAGATTTrCT TAACCTTTTG CAA6CTCACA 600 

6CAATTCIGA TAATTATAGT CCXrTGGAGTT A1GCAGCTAA TTAAAGGTCA AAC6CAGAAC 660 

TTTAAAGA06 0GTTTTCAS6 AAGAGATTCA AGTATTACSC OGTTSCCACr G6C1TTTTAT 720 

TATGGAATGT ATGCATATGC TGGCTGGTTT TACCTOVACT TTGTTACTGA AGAA0TA6AA 780 

AACCCTGAAA AAACCATTCC CCTTGCAATA TGTATATCCA TGGCCATTGT CACCATTGGC 840 

TATGT6CTGA CAAATGTGGC CTACTTTAOG ACCATTAATG CTGAGGAGCT 6CT6CTTTCA 900 

AAT6CA62X3S GAGTGACCTT TTCTGA6CG3 CZACTGGGAA ATTTCTCATT A6CAGTTCCB 960 

ATCTTTGTTG CCCTCTCCTG CTTTGGCTCC ATGAAO G GT G GTGTGTTTGC TGTCTCCAGG 1020 

TTATTCTATG TTGCGTCTCG AGAGGGTCAC CTTCCAGAAA TCCTCTCCAT GATTCATGTC 1080 

CGCAAGCACA CTCCTCTACC AGCTGTTATT GTTTTGCACC CTTTGACAAT QATAATGCTC 1140 

TTCTCTGGAG ACCTOGACAG TCTTTTGAAT TTCCTCAGTT TTGCCAGGTG GCTTTTTATT 1200

GGGCTGGCAG TTGCT6GGCT GATTTATCTT G6ATACAAAT 6CCCAGATAT 6CAT0GTCCT 1260 

TTCAAOGTGC CACTGTTCAT CCCAGCTTTG TTTTCCTTCA CATGCCTCTT CATGGTTGCX: 1320 

CTTTCCCTCT ATTCGGACCC ATTTAGTACA GGGATTGGCT TOGTCATCAC TCTGACTGGA 1380 

GTCOCTGCGT ATTATCTCTT TATTATATGG GACAAGAAAC CCAGGTGGTT TAGAATAATG 1440 

TCAGAGAAAA TAACCAGAAC ATTACAAATA ATACTGGAAG TTGTACCAGA AGAAGATAAG 1500 

TTATXSAACTA AT6GACTTGA GATCTTGGCft ATCTGCOCAA GGGGA6ACAC AAAftTAGGGA 1560 

TTTTTACTTC ATTTTCTGAA AGTCTAGAGA ATTACAACTT TGGT6ATAAA CAAAAGGAGT 1620 

GAOTTATTTT TATTCATATA TTTTAGCATA TTOGAACTAA TTTCTAAGAA ATTTAGTTAT 1680 

AACTCTATGT AGTTATAGAA AGTGAATATG CAGTTATTCT ATGAGTOSCA CAATTCTTGA 1740 

GTCTCTGATA CCTACCTATT GGGGTTAGGA GAAAAGACTA GACAATTACT ATGTGGTCAT 1800 

TCTCTACAAC ATATGTTAOC ACGGCAAAGA ACCTTCAAAT TGAAGACTGA GATTTTTCTG 1860 

TATATATGGG TTTTGTAAAa ATGGTTTXAC ACACTACAGA TGTCTATACT GTGAAAA6TG 1920 

TTTTCAATTQ TQAAAAAAAO CATACATCAT QATTATGGCA AAGAGGAGAO AAAGAAATTT 1980 

ATTTTACATT GACATTGCAT TGCTTCCCCT TAGATACCAA TTTAGATAAC AAACACTCAT 2040 

6CTTTAATGG ATTATACCCA GAGCACTTTG AACAAAOGTC AGTGGGGATT GTTGAATACA 2100 

TTAAAGAAGA GTTTCTAGGG GCTACTGTTT AT6AGACACA TCCAGGAGTT ATGTTTAAGT 2160 

AAAAATCCTT GAGAATTTAT TATQTCA6AT GTTTTTTCAT TCATTATCAG GAAGTTTTAG 2220 

TTATCTGTCA ' ITriTm TT TCACATCAGT T7GATCAGGA AAGT6TATAA CACATCTTAQ 2280

AGCAAOAGTT AOTTTGGTAT TAAATCCTCA TTA6AACAAC CACCTGTTTC ACTAATAACT 2340 

TACOCCTGAT GAGTCTATCT AAACATATGC ATTTTAAGCX: TTCAAATTAC ATTATCAACA 2400 

TGAGAGAAAT AACCAACAAA GAAGATGTTC AAAATAATAG TCCCATATCT GTAATCATAT 2460 

CTACATGCAA TGTTAGTAAT TCTGAAGTTT TTTAAATTTA TGGCTATTTT TACACGATGA 2520 

TCAATTTTGA CAOTTTGTGC ATTTTCTTTA TACATTTTAT ATTCTTCT6T TAAAATATCT 2580

CTTCAGATGA AACTGTCCAQ ATTAATTAOG AAAAGGCATA TATTAACATA AAAATT6CAA 2640 

AAGAAATGTC GCTGTAAATA AGATTTACAA CT6ATGTTTC TAGAAAATTT CCACTTCTAT 2700 

ATCTAGGCTT TGTCAGTAAT TTCCACACCT TAATTATCAT TCAACTTOCA AAAGAGACAA 2760 

CTGATAAGAA GAAAATTGAA ATGAGAATCT GTGQATAAGT GTTTGTGrTC AGAAGATGTT 2820 

GTTTTOGCAO TATTAGAAAA TACTOTGAGC OC5G0CATGGT GGCTXACATC T8TAATGCCA 2880 

GCACTTTGGG AGGCTGAG6G 66TGGATCAC CT6A6GT0G6 GA G ITCT A GA CCA6GCTGAC 2940 

CAACATG6AG AAACCCCATC TCTACTAAAA ATACAAAATT AGCTGGGCAT GGTGGCACAT 3000 

GCTGGTAATC TCAGCTATTG AGGAGGCTGA GGCAGGAGAA TTGCTTGAAC CXXWGAGGOG 3060 

GAGGTTGCAG TGAGCCAAGA TTGCACCACT 6TACTCCAGC CTGGGTGACA AAGTCAGACT 3120 
GCATCTCCAA AAAAAAAAAA AAAA 

Seq ID NO: 395 Protein sequence
Protein Accession #: NP_055146.1

1 11 21 31 41 51

I I I I I I

MVRKPWSTI SKGGYLQGNV NGHLPSLQIK EPPGQEKVQI* KRKVTLLRGV SIIIGTIIGA 60 

QIFISPKGVL QNTGSVGMSL TIWTVCGVLS LFGALSYAEL GTTIKKSGGH YTYILEVPGP 120 

LPAPVRVWVB LLIIRPAATA VISXAFGRYX LEPFFIQCBl PELAIKLITA VQITWMVIiK 180 

SMSVSHSARX QIFLTFCKLT AZLIIZVPGV HQLZK6QT0N FKDAPSGRDS SZTRLPLAFY 240 

YOUYAYACWP YliKFVTBEVE HPEKTIPLAI CISMAITIGV YVLTNVAYFT TINAEBLLLS 300 

NAVAVTFSER LLGMFSLAVP IFVALSCFGS MMGGVPAVSR LFYVASRBGH LPEILSMIHV 360 

RKHTPLPAVI VLHPLTMIML FSGDLDSLLM FLSFARHLFI GLAVAGLIYL RYKCPDMHRP 420 

FKVPIiFIPAL FSFTCLFMVA LSLYSDPFST GIGFVITLTG VPAYYLFIIH DKKPRWFRIM 480 
SEKITRTU2X ZIiEWPEEDK Ii 



Seq ID NO: 396 DNA sequence
Nucleic Acid Accession #: NM_006528
Coding sequence: 57..764

1 11 21 31 41 51

I I I I I I

GCCGCX3160S GCTTTCrCGG A06CCTTG0C CAGOGGGCCG CCCGACOCOC TGCACCATGG 60 

ACOOQGCTOS C0CCCTGG66 CTGT06ATTC TGCTGCTTTT CCTGACSGGAG GCTGCACTGQ 120 

GCGAT6CT6C TCAGGA6CCA ACAGQAAATA AOGOGGASAT CTGTCTCCTG CCCCTAQVCT 180 

ACGGACCCTQ OOGGGCCCTA CTTCPOOGTT ACTACTAOGA CAGGTACA05 CAGAGCTGCX: 240 

GOCAGTTCCT GTJ^COGGGGC TGCGAGGGCA AOGCCAACAA TTTCTACAOC TGGGAGGCTT 300 

GO6A0GATGC TTGCTGGAGG ATAGAAAAAG TTCCCAAAGT TTGCOGGCTQ CAAGTGA6T6 360 

T6GA0GACCA GTGTCAGGGG TCCACAGAAA AGTAITrCTT TAATCTAAGr TCCAroAOVT 420 

GTGAAAAATT CTTTTCCGGT GGGTGTCACC GGAACOSQAT TGAGAACAOG TTTCCAGATQ 480 

AAGCTACTTG TATGGOCTTC TGOGCACCAA AGAAAATTOC ATCATTTTGC TACAGTCCAA 540 

AAGATQA6GG ACTGTGCTCT GCCAATGTGA CTCGCTATTA TTTTAA TCCA AGATACAGAA 600 

CCTGTGAT6C TTTCACCTAT ACTG6CTGTG GAGG6AAT6A CAArAACTTT GTTAGCAGG6 660 




AGGATTGCTUl AOGTGCATGT GCAAAAGCTT TGAAAAAGAA AAAGAAGATG CCAAAGCTTC 720 

GCTTTGCCAG TAGAATCOGO AAAATTOGGA AGAAGCAATT TTAAACATTC TTAATATGTC 780 

ATCTTGTTTG TCTTTATGGC TTATTTGCCT TTATGGTTGT ATCTGAAGAA TAATATGACA 840 

GCATGAGGAA ACAAATCATT GGTGATTTAT TCACCAGTTT TTATTAKTAC AA0TCACTTT 900 

TTCAAAAATT TOGATTTTTT TATATATAAC TAGCT6CTAT TCAAATGTGA GTCTACCATT 960 

TTTAATTTAT G6TTGAACTG TTTGTGAGAC GAATTCTTGC AATGCATAAG ATATAAAAGC 1020 

AAATATGACT CACTCATTTC TTGGGGTCXTT ATTCCTGATT TCAGAAGAGG ATCATAACTG 1080 

AAACAACATA AGAGAATATA ATCATGTGCT TTTAACATAT TTQAGAAIAA AAAGGACTAG 1140 
CC 

Seq ID NO: 397 Protein sequence
Protein Accession #: NP_006519

1 11 21 31 41 51 

I I I I I I 

MDPARPL6LS lUiLFLTEAA LGDAAQSPIG NNAEICLLPIi DYGPCRALLL RyyyORYTQS 60 

CRQFLYGGCB GMAMrPYTWE ACDDACHaiB KVPKVCRLQV SVDDQCBGST BKYFFNLSSM 120 

TCEKFFSG6C HBNRZENRFP DBATOfSFCA PKKIPSFCYS PKDBGLCSAN VTRYYETIIPRY 180 
RTCDAPTYTO OGGSDMNFVS RBDCKRACAK ALKKKKKMPR LRFASRXRKZ RRKQP 

Seq ID NO: 398 DNA sequence

Nucleic Acid Accession #: NM_001508.1

Coding sequence: 1..1361

1 11 21 31 41 51

I I I I I I

ATGGCTTCAC CCAGCCTCCC GGGCAGTGAC TOCTCCCAAA TCATT6ATCA CASTCAT6TC 60 

CC05AGTTTC AGGTGGCCAC CTQGATCAAA ATCACCCTTA TTCTGGTGTA CCTQATCATC 120 

TTOTPGATGG GCCTTCTGGG GAACAGCGTC ACCATTCGGG TCACCCAGGT GCTGCAGAAQ 180 

AAAGGATACT T6CA6AAGGA OGTGACAGAC CACAT6GT6A GTTTG6CTTG CTCGGACATC 240 

TTGOTGTTCC TCATOGGCAT 00CCATG6A6 TTCTACAGCA TCATCTOGAA TCOCCTGACC 300 

AOGTOC3M3CT ACACCCTGTC CTGCAAGCTG CACACTTTOC TCTTOaAOOC CTGCAGCTAC 360 

GCTAOGCTGC TGCAOGTGCT GACGCTCAGC TTTGAGOGCT ACATOGCCAT CTGTCACCCC 420 

TTCAGGTACA AGGCTGTGTC GGGACCTTGC C3U3GTGAAGC TGCTGATTGG CTTCGTCTGG 480 

GTCACCTCOG COCTGGTGOC ACTGCCCTTG CrGTTTGCXA TGGGTACTGA 6TACC0CCTG 540 

GTGAAOnGC GGAGGCACOS GOGTCTCACT TGCAAC06CT OCAGCACCOG CCACCA06AG 600 

CAOCCCGAGA CCTCCAATAT OTCCATCTGT ACCSkACCTCT OCAGCCGCTG GACCGTGTTC 660 

CAGTCCAGCA TCTTCGGCGC CTTCGTGGTC TACCTCGTGG TCCTGCTCTC CGTAGCCTTC 720 

ATGTGCTGGA ACATGATGCA GGTGCTCAT3 AAAAOOCAGA AGGGCTCGCT GGCX3KMGGC 780 

ACGCGGCCTC CGCA6CT6A0 GAAGTCOSAG AG0GAAGA6A GCA6GAC0GC CAGGAGGCAG 840

ACCATCATCT TOCTGAGGCT G A TTQTTGTG ACATTGGCOS TATGCTOQAT 6G0CAACCAG 900 

ATTOGGAGGA TCAT6GCTGC GGCCAAACCC AAGCA06ACT GGAOSAGGTC CTACTTCCGG 960 

GCGTACATGA TCCTCCTCCC CTTCTCGGAG AOGTTTTTCT ACCTCAGCTC GGTCATCAAC 1020 

CXXJCTCCTGT ACAOGGTGTC CTCXK3^GCAG TTTCGGOGGG TGTTCGTGCA GGTGCTGT6C 1080 

TGCCGCCTGT CGCTGCAGCA OQCCAACCAC GAGAACCGCC TGOGCGTACA TGCOCACTCC 1140 

ACCACCGACA GOGCCCXJCTT TOTGCACCCC COGT TGCTC T T0GC3GTCC00 GOOOCAOTCC 1200 

TC7GCAAGGA GAACTGAGAA GATTTTCTTA A6CACT7TTC AGAGCGAGGC 0QAGCCCCA6 1260 

TCTAAGTCCC AOTCATTGAG TCTaSAGTOl CTAGAGCCCA ACTCAG606C 6AAACCAGCC 1320 
AATTCTGCTG CAGAGAATGG TTTTCAGGAO CATGAAGTTT GA 

Seq ID NO: 399 Protein sequence
Protein Accession #: NP_001499.1

1 11 21 31 41 51

I I I I I I

MASPSLPGSD CSQIIDHSHV PSPBVATWIK ITLILVYLII FVMGItLGNSV TIBVTQVXiQK 60 

KGYljQKBVTD HMVSLACSDI LVFLIOIPMB FYSIIWNPLT TSSYTLSCKL HTPLFBACSY 120 

ATIiLBVLTLS FERYIAICHP PRYKAVSGPC QVKLLIGPVH VTSALVALPL LFAMGTEYPL 180 

VHVPSHRGLT OJRBSTRHHE QPETSNMSIC TNLSSRWTVF QSSIFGAFW YLWLLSVAP 240 

MCWNMMQVLM ICSQKGSLAGG TRPPQLRKSE SEESRTARRQ TIIFIiRLIW TLAVCWMPNQ 300 

IRRIMAAAKP KHDWTRSYFR AYMILLPPSE TPFYLSSVXN PLLYTVSSQQ PRRVFVCJVLC 360 

CRLSLQHAHH EKRUIVRAHS TTDSARFVQR PLLFASRRQS SARRTBKIFL STFQSEABPQ 420 
SXSQSLSLBS LEPNSGAKPA !ISAAEHGFQB HEV 

Seq ID NO: 400 DNA sequence

Nucleic Acid Accession #: NM_006475.1

Coding sequence: 28..2538

1 11 21 31 41 51

I I I I I I

AACAGAACTG CAAOGGAGAG ACTCAAGATG ATTCCCTTTT TAOCCATGTT TTCTCTACTA 60 

TTGCTGCITA TTGTTAACCC TATAAACGCC AACAATCATT ATGACAAGAT CT TGGCT CAT 120 

AGTOQTATCA GGGOTOGOGA CCAAGGCCCA A A TOTCT G T G CCCTTCAACA GATTTTGGGC 180 

ACCAAAAAGA AATACTTCAO CACTTGTAAG AACTGGTATA AAAAGTCCAT CTGTGGACAG 240 

AAAAOGACTG TTTTATATGA ATGTTGCCCT GGTTATATGA GAATGGAAGG AATGAAAGGC 300 

TGCCCAGCAG TTTT6CCCAT TGACCATGTT TATGGCACTC TGG6CATCGT GGGAGCCACC 360 

ACAAOGCAGC GCTATTCT6A OOOCTCAAAA CTQAGGGAGG AGATCGAGGG AAAGGGATCC 420 

TTCACTTACT TTQCAC06A0 TAATGAOGCT TGGGACAACT TOGATTCTGA TATCCGTAGA 480 

GGTTTGGAGA GCAAOGTGAA TGTTGAATTA CTGAATGCTT TACATAGTCA CATGATTAAT 540 

AAOAGAATGT TCAOCAAGGA CTTAAAAAAT GGCATGATTA TTCCTTCAAT GTATAACAAT 600 

TTGGGGCTTT TCATTAACCA TTATCCTAAT GGGGTTOTCA CTGTTAATTG TGCTOGAATC 660 

ATOCATGGGA ACCA6ATTGC AACAAATGOT GTTGTCCATG TCATTGAC06 TGTGCTTACA 720 

CAAATTGGTA CCTGAATTCA AGACTTCATT GAAGCAGAAG ATGACCTTPC ATCTTTTAGA 780 

GCAOCTGCCA TCACATOGGA CATATTGGAG GCCCTTGGAA GAGA006TCA CTTCACACTC 840 

TTTOCTCCCA CCAAT6AGGC TTTTGAGAAA CTTCCACGAG GTGTCCTAGA AAGGTT CATG 900 

GGAGACAAAG TGGCTTCCGA AGCTCTTATG AAGTACCACA TCTTAAATAC TCTCCAGTGT 960 

TCTGAGTCTA TTATGGGAQG A6CAGTCTTT GA6AC6CTG6 AAGGAAATAC AATTGAGATA 1020 
GGATGTGhOS GIGACM3TAT A&CAGTAAAT GGftftTCftAAIl TGGT6AACAA AAAOGATATT 1080

GTGACAA&TA ATGGTGTGAT CCATTTGATT GATCAGGTOC TAATTCCTGA TTCTCOC A AA 1140 

CAAGTTATT6 AGCTGGCTG6 AAA^^AGCAA ACCACCTTCA GGGATCTTGT GGOOOU^TTA 1200 

GGCTTGGCAT CTGCTCTGAG GCCAGATGGA GAATAC3«rn TGCTGGCACC TGTGAATAAT 1260 

GCATTTTCTO ATGAIACTCT CAGCATGGTT CAGOGCCTCC TTAAATTAAT TCTGCAGAAT 1320 

CACATATTGA AAGTAAAAGT TGGCCTTAAT GA6CTTTACA AOGGQCAAAT ACTGGAAACC 1380 

ATCGGAOGCA AACA6CTCAG AGTC ntX T JA TATOGXACAO CTGTCTGCAT T6AAAATTCA 1440 

TGCATGGAGA AAGGGAGTAA GCAAGGGAGA AACGGTGOSA TTCACATATT CCXK3GAGATC 1500

ATCAAGCCAG CAGAGAAATC CCTCCATQAA AAGTTAAAAC AAGATAAGCG CTTTAGCACC 1560 

TTCCTCA6CC TACTTQAA6C TGCACSACTTG AAAGAGCTOC TGACACAACC TGGA6ACT6Q 1620 

ACATTATTT6 TGCCAAOCAA TGATGCTTTT AAGGGAATGA CTAGTGAAGA AAAAGAAATT 1680 

CTGATAOGGG ACAAAAATGC TCTTCAAAAC ATCATTCTTT ATCACCTGAC AOCAOGAGTT 1740 

TTCATTGGAA AAGGATTTGA ACCTGGTGTT ACTAACATTT TAAAGACCAC ACAAGGAAGC 1800 

AAAATCTTTC TGAAAGAAGT AAATGATACA CTTCTGGTGA ATGAATTGAA ATCAAAAGAA 1860 

TCTGACATCA TGACAACAAA TGGTGTAATT CATGTTGTAG ATAAACTCCT CTATOCAGCA 1920 

GACACACCTG TTGGAAATGA TCAACTGCTG GAAATACTTA ATAAATTAAT CAAATACATC 1980 

CAAATTAAGT TTGTTCGTGG TAGCACCTTC AAAGAAATOC COGTGACTGT CTATACAACT 2040 

AAAATTATAA CCAAAGTTGT GGAACCAAAA ATTAAAGT6A TTGAAGGCAG TCTTCAGCCT 2100 

ATTATCAAAA CTGAAGGACC CACACTAACA AAAGTCAAAA TTGAAGGTGA ACCTGAATTC 2160 

AGACTGATTA AAGAAGGTGA AACAATAACT GAAGTGATCC ATGGAGAGCC AATTATTAAA 2220 

AAATACACCA AAATCATTGA TGGAGTGCCT GTGGAAATAA CTGAAAAAGA GACAOGAGAA 2280 

QAAOGAATCA TTACftGGTCC TGAAATAAAA TACACTAGGA TTTCTACTGG AGGTGGAGAA 2340 

ACAGAAGAAA CTCIGAAGAA ATT6TTACAA GAAGA6GTCA CXIAAGGTCAC CAAATTCATT 2400 

GAAG0T6GTG ATGGTCATTT ATTTGAAGAT GAAGAAATTA AAA6ACTGCT TCAGGGAGAC 2460 

ACACCOGTGA GGAAGTTGCA AGCCAACAAA AAAGTTCAAO GTTCTAGAA6 AOGATTAAGG 2520 

GAAGGTCXjTT CTCAGTGAAA ATCCAAAAAC CAGAAAAAAA TGTTTATACA ACCCTAAGTC 2580 

AATAAOCTGA CCTTAGAAAA TTGTGAGAGC CAAGTTGACT TCAGGAAC7G AAACATCAGC 2640 

ACAAAGAAGC AATCATCAAA TAATTCTGAA CACAAATTTA ATATTTTTTr TTCTGAATGA 2700 

GAAACATGAO 66AAATTGTO GAGTTAGCCT CCTGTGGTAA AGGAATTGAA 6AAAATATAA 2760 

CACCTTACAC CCTTTTTCAT CTTGACATTA AAAGTTCTGG CTAACTTTGG AATCCATTAG 2820 

AGAAAAATCC TTGTCACCAQ ATTCATTACA ATTCAAATOG AAGAGTTGTG AACTGTTATC 2880 

CCATT6AAAA GACCGAGOCT TGTATGTATO TTAT6GATAC ATAAAATGCA OGCAAGCCAT 2940 

TATCrCTCCA TGGGAAGCTA AGTTATAAAA ATA0GT6CTT GGTGTACAAA ACTTT TTATA 3000 

TCAAAAOGCT TT8CACATTT CTATATGA6T GG6TTTACT6 GCAAATTATO TTATTTTTTA 3060 

CAACTAATTT TGTACTCTCA GAATGTTTGT CATATGCTTC TTGCAATGCA TATTTTTTAA 3120 

TCTCAAAC3GT TTCAATAAAA CCATTTTTCA GATATAAAGA GAATTACTTC AAATT6AGTA 3180 
ATTCAGAAAA ACTCAAGATT TAAGTTAAAA AGTGGTTTGG ACTTGGGAA 

Seq ID NO: 401 Protein sequence
Protein Accession #: NP_006466.1

1 11 21 31 41 51 

I I I I I I 

KIPFLPMF8L LLLIiXVMPXN AHNKYDKIXA ESRIRGROQG PNVCALQQIIi GTKKKYFSTC 60 

IQIWVKKSIOO QKTTVLYECC PGYMRMBGMK GCPAVWIDH VYGTLGIVGA TTTQRYSDAS 120 

KLREEIBGKG SFTYFAPSNE AHDNLDSDIR RGLBSNVNVE LLMAIiHSHMI NKHMLTKDZJC 180 

IIGMIIPSMYN iniGLPimiYP NGWTVNCAR ZIBGITQXAtM 6WKVIDRVL TQIGT5IQDF 240 

IBAEDDLSSF RAAAZT8DZL BAXiOROGBFT LFAP1NEA7B KZiPRGVLBRF MGDXVASEAL 300 

MKyHlUiTLQ CSBSZMGQAV FBTLEGNTIB Z6CD6D8ZTV NGZKMVNKKD ZVTOHGVZEL 360 

ZDQVLIPDSA KQVZELAGKQ QTTPTDLVAQ LGLASALRPD GEYTLLAPVN NAPSDDTLSM 420 

VQRLLKLIIiQ NHILKVKVGL NELYNGQILE TIGGKQLRVF VYRTAVCIEN SCMEKGSKQG 480 

RKGAIHIFRE ZZKPABKSLH EKLKQDKRFS TFLSLLEAAD LKELLTQPGO HTLFVPTllDA 540 

FKGHTSEBKE ZLZRDKNALQ NZZLYHIiTPa VFZGKGFBPG VTNZLICrTQG SKZFLKEVND 600 

TLLVNELKSK BSDZNTTNGV ZEWDXLLYP ADTPVGNDQL hElJJSIKLlKf ZQZKFVR6ST 660 

PKEIPVTVYT TKZITKWEP KIKVIBGSLQ PIIKTEGPTL TKVKIEGEPB FRLZKBGETZ 720 

TEVIHGEPIZ KKYTKIIDGV PVBITEKETR EERIITtSPBI KYTRISTGGG ETEETUCKLL 780 
QEEVTKVTKF ZBG6DGHLFB DBBIKRLLQG DTFVRKLQAN KKVQGSRRRL RBGRSQ 

Seq ID NO: 402 DNA sequence
Nucleic Acid Accession #: NM_002416
Coding sequence: 40..417

1 11 21 31 41 51

I I I I I I

ATOCAATACA GGAGTGACTT GGAACTCCAT TCTATCACTA TGAAGAAAAS TGGTGTTCTT 60 

TTCCTCTTGG GCATCATCTT GCTGGTTCPG ATTGGAGTGC AAOGAACCCC AGTAGTGAGA 120 

AAGG6T06CT GTTOCTGCAT CAGCACCAAC CAA66GACTA TCCACCTACA ATCCTTGAAA 180 

GACCTTAAAC AATTTGOCCC AAGOOCTTCC TQOGAGAAAA TTGAAATCAT TGCTACACTS 240 

AAGAAT66A6 TTCAAACATG TCTAAACCCA GATTCAGCAG ATGT6AA0GA ACT6ATTAAA 300 

AAGTGGGASA AACAGGTCAG OCAAAAGAAA AAGCAAAAGA ATGGGAAAAA ACATCAAAAA 360 

AAGAAAGTTC T6AAAGTTCG AAAATCTCAA 06TTCT0GTC AAAAGAAGAC TACATAAGAG 420 

ACCACTTCAC CAATAAGTAT TCTGTGITAA AAATGTTCTA TTTTAATTAT ACOOCTATCA 480 

TTOCAAAGGA 6GATG0CATA TAATACAAAO OCTTATTAAT TTGACTAGAA AATTTAAAAC 540 

ATTACTCTGA A A TTXS TA ACT AAAGTTAGAA AOTTGATTTT AAGAATCCAA A0GTTAA6AA 600 

TTGTTAAAGG CTATGATTGT Cm 'G' n C i 'T CTACCACCCA CCAGTTGAAT TTCATCATGC 660 

TTAAGGCCAT GATTTTAGCA ATACCCATGT CTACACAGAT GTTCACCCAA CCACATCCCA 720 

CTCACAACAG CTGCCTGGAA GAGCAGCXrCT AGGCTTCCAC GTACTGGA8C CTCCA6A6A6 780 

TATCTGAGOC ACATGTCAGC AAOTGCZAAa OCTXSTTAOa^ TGCTOGTGAO OCAAGCASTT 840 

TGAAATTGAG CTGGACCTCA OCAAGCTGCT GTGGCCATCA AOCTCTGTAT TT6AATCA6C 900 

CTACAGGCCT CACACACAAT GTGTCTGAGA GATTCAT6CT GATTGTTATT GGGTATCAOC 960 

ACTGGAGATC ACCAGTGTGT GGCTTTCAGA GCCTCCTTTC TGGCTTT66A AGCX3VTGTQA 1020 

TTCCATCTTG CCOGCTCAGG CTGACCACTT TATTTCTTTT TGTTCCCCTT TGCTTCATTC 1080 

AAGTCAGCTC TTCTCCATCC TAOCACAATG CAGTGCCTTT CTTCTCmA GTGGAOCTGT 1140 

CATATGCTCT GATTTATCTG AGTCAACTOC TTTCTCATCT TQTGCCCAAC ACC0CACA6A 1200 

AGTGCTTTCT TCTCCCAATT CATCCTCACT CAGTCCASCT TAGTTCAAGT CCTGCCTCTT 1260 

AAATAAACCT TTTTGQACAC ACAAATTATC TTAAAACTOC TGTTTCACTT GGTTCAGTAC 1320 

CACATGG6TQ AACACTCAAT GGTTAACIAA TTCTTGG6TG TTTATCCTAT CTCTOCAACC 1380 
AQATTGTCAG CTOCTTQAGG GCAAGAOCCA CAOTATATTT UCCWAYl ' C ' r TGCACAiSTGC 1440

CTAATAATAC TGTGGAACTA GGTTTTAATA ATTTTTTAAT TSATGTTGTT ATGGCCAGGA 1500

TGGCAAOCAG ACCATTGTCT CAGAGCAGGT GCTC3GCTCTT TCCTGGCTAC TCCATGTTGG 1560 

CTAGCCTCTG GTAACCTCTT ACTTATTATC TTCAGGACAC TCACTACAGG GACCAGGGAT 1620

GATGCAACAT OCTTGTCTTT TTATGACftGG ATGTTTCCTC AGCTTCTCCA ACAATAAGAA 1680 

GCA0GTG6TA AAACACTTCC GOATATTCTG GACTGTTT7T AAAAAATATA CaOT TTAOO S 1740 

AAAATCATAT AATCTTACAA TQAAAAGGAC TTTATAGATC AGCCAGTGAC CAACCTTTTC 1800 

CCAACCATAC AAAAATTCCT TTTCCOSAAG GAAAAGGGCT TTCTCAATAA GCCTCAGCTT 1860 

TCTAAGATCT AACAAGATAG CCACCGAGAT CCTTATOGAA ACTCATTTTA GGCAAATATG 1920 

A Ori ' ilA TTG TOCXSTTTACT TGTTTCAGAG TTTGTArCGT GATTATCAAT TACCACACCA 1980 

TCTCCCAIGA AGftAAOGGAA OSGTGAAGTA CTAA606CTA GAGGAAGCAG CCAAGTGGGT 2040 

TAGTGGAA6C ATGATTG6TG C0CA6TTAGC CTCTGCAGGA T8TGGAAACC TCCTTCCAGG 2100 

GGAGGTTCAG TGAATTGTGT AGGAGAGGTT GTCTGTGGCC AGAATTTAAA CCTATACTCA 2160 

CTTTCCCAAA TTGAATCACT GCTCACACTG CTGATGATTT AGAGTGCTGT CCGGTGGAGA 2220 

TOOCACCCGA AOGTCTTATC TAATCATGAA ACTCCCTAGT TCCTTCATGT AACTTCCCTG 2280 

AAAAATCTAA GTGrTTCATA AATTTGA6A6 TCTGTGACCC ACTTACCTTQ CATCTCACAG 2340 

GTAGACA6TA TATAACTAAC AACCAAAGAC TACATATTGT CACT6ACACA aUXJFTATAA 2400 

TCATTTATCA TATATATACA TACATGCATA CACTCTCAAA GCAAAXAATT TTTCACTTCA 2460 

AAACAGTATT GACTTGTATA CCTTGTAATT TGAAATATTT TCITTGTTAA AATAGAATGG 2520 
TATCAATAAA TAGACCATTA ATCAG 

Seq ID NO: 403 Protein sequence
Protein Accession #: NP_002407

1 11 21 31 41 51

I I I I I I

NKKSGVLFLL GIZLLVIiIGV QGTFWRKGR CSCXSTNQGT IHLQSLKDLK QPAPSPSCEK 60 

ZBZIATLKNG VQTCLNPD5A DVXELIXXWB KQVSQKKKQK NGKKHQKKKV LKVRXSQRSR 120 

QXKTT 

Seq ID NO: 404 DNA sequence 

Nucleic Acid Accession #: NM_006670

Coding sequence: 85..1347

1 11 21 31 41 51 

I I I I I I 

CX»GCTG6CX3 CGCTCOGGGC CCA6CCTCCC GAGCCTTOGG AGOSGGCGCC GTCCCAGCCC 60 

AGCTCCX3066 AAA06CGAGC OGCGATGCX:? G0GGGGT6CT CCOGGGGCCC CGC0GCCGG6 120 

GA060GGGTC TG0GGCTG6C 6C36ACTAOGG CTGGTACTCC TGGGCTGGGT CTCCTCGTCT 180 

TCTOCCACCT CCTOGGCATC CTCCTTCTCC TCCTCGGCGC C3GTTCCTGGC TTCOGCCGTG 240 

TCCGCCCAOC CCCOGCTGCC GGACCAGTGC CCCGOGCTGT GOGAGTGCTC CGAGGCAGOS 300 

0GCACA5TCA AGTGC6TTAA COGCAATCTG ACOGAGGTGC OCACGGACCT GCCGGCCTAC 360 

GTGOGCAACC TCTTCCTTAC OGGCAACCAa CTGGCC3QTGC TGCCTGCGQG C3GCCTTGG0C 420 

OSCOGGCCGC OSCT GG CGGA GCTGGC06G6 CTCAACCTCA 60GGCAGC00 CCTGOAOGAG 480 

GTGCGCGCXX3 GCGCCTTCGA GCATCTGOCC AGCCTGOGCC AGCTOGACCT CABCCACAAC 540 

CCACTGGCOQ ACCTCAGTCC CTTCGCTTTC TCGGGCAGCA ATGCCAGCGT CTOGGCCCCC 600 

AGTCCCCTTG TGGAACTGAT CCTGAACCAC ATC6TGCCCC CTGAAGATGA GGGGCAGAAC 660 

OGGAGCTTOO AGGGCAT66T GGT66CG6CC CTGCTG60G0 GCOGIGCACT GCAQGGOCTC 720 

06CC6CTT66 AGCTGGCCAG CAACCACTTC CTTTACCTOC GGOGGGATGT GCTGGCCCAA 780 

CTGCCCAGCC TCAGGCACCT 6GACTTAAGT AATAATTOOC TGGTGAGCCT GACCTACGTG 840 

TCCTTCOGCA ACCTGACACA TCTAGAAAGC CTCCACCTGG AGGACAATGC CCTCAAGGTC 900 

CTTCACAAT6 GCACCCTGGC T6AGTT6CAA 6GTCTAO0CC ACATTAGGGT TTTCCTGGAC 960 

AAChATCCCT OOGTCTGCGA CT6CCACA1Q 6CAGACATG6 TGACCIGGCT CAAOGAAACA 1020 

GAGGTAGTGC AGG6CAAAGA C066CTCA0C T6T6CATATC CX3GAAAAAAT GAGGAATOGG 1080 

GTCCTCTTGG AACTCAACAG TGCTGACCTG GACTGTGACC OGATTCTTCC CCCATCCCTG 1140 

CAAACCTCTT ATGTCTTCXrr GGGTATTGTT TTAGCCCT6A TAGGCGCTAT TTTCCTCCTG 1200 

GTTTTGTATT TGAACX^GCAA GGGGATAAAA AA6TGGATGC ATAACATCAQ AGATGCCTGC 1260 

AGQGATCACA TGGAAGGGTA TCATTACAGA TATGAAATCA AT6066AC0C CAGATTAACA 1320 

AACCTCAGTT CTAACTCGGA- TGTCT6AGAA ATATTAGAGG ACAGACCAAO GACAACTCTG 1380 

CATGAGATGT AGACTTAAGC TTTATCCCTA CTAGGCTTGC TCCACTTTCA TCCTCCACTA 1440 

TAGATACAAC GGACTTTGAC TAAAAGCAGT GAAGGGGATT TGCTTCCTTG TTATGTAAAG 1500 

TTTCTCGGT6 T6TTCTGTTA AT6TAA6AC3Q ATGAACAGTT GTGTATAGTG TTTTACCCTC 1560 

TTCTTTTTCT TQGAAC7CCT CAACAOSTAT GGA6G6AT7T T TCAG GTTTC AGCATGAACA 1620 

TGGGCTTCTT OCTGTCTGTC TCTCTCTCA6 TACAGTTCAA GGTGTAGCAA 6TGTACCCAC 1680 

ACAGATAGCA TTCAACAAAA GCTGCCTCAA CTTTTTOSAG AAAAATACTT TATTCATAAA 1740 

TATCAGTTTT ATTCTCATGT ACCTAAGTTG TGQAGAAAAT AATIGCATCC TATAAACTGC 1800 

CTGCAGACGT TASCAGGCTC TTCAAAATAA CTCCATGGTO CACAGGAGCA CCTGCATCCA 1860 

AGAGCATGCT TACATTTTAC TOTTCTQCAT ATTACAAAAA ATAACTTGCA ACTTCATAAC 1920 

TTCTTTGACA AAGTAAATTA CTTTTTTGAT TGCA6TTTAT ATGAAAATGT ACTGATTTTT 1980 

TTTTAATAAA CTGCATC6AG ATCCAACOGA CTQAATTGTT AAAAAAAAAA AAAAATAAAO 2040 
ATTCTTAAAA GAA 

Seq ID NO: 405 Protein sequence
Protein Accession #: NP_006661

1 11 21 31 41 51

I I I I I I

MPQ6CSRGPA ACaXSRIiRIAR LALVLLGfiIVS SSSPTSSASS FSSSAPFLAS AVSAQPPZ.PD 60 

QCPALCBCSB AABTVKCVNR NLTBVFTDLP AYVRNLFLTG NQLAVLPAGA FARRPPLAEL 120 

AALNLS6SRL DCVHAGAFGH LPSLRQLDLS HNPLADLSPP AFSGSNASVS APSPLVELXL 180 

NHIVPPBDER QHB5PE(3^ AALLAGRALQ 6LRRLELASN BFLYLPRDVIi AQLPSLRBU) 240 

LSNNSLVSLT YVSFRNLTBL ESLBLEOIAL KVLBNGTZiAE LQGLFHIRVF LONNpnVCDC 300 

HMADKVTtTLR ETEWQGKDR LTCAYPEKKR NRVLLELNSA DUXDPZLPP SLQTSYVFLG 360 
IVLALIQAIF LLVLYIiNRHQ IKKHKKNUtD ACSDHMBGYB YRYBIHADPR LTNLSSHSDV 

Seq ID NO: 406 DNA sequence

Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..927



1 11 21 31 41 51

I I I I I I

AT6CCTGGGG GGTGCTCCC36 GGG0CC06CX GCXX3GGGA06 OGOGTCTGCXS GCTG6G6CGA 60

CTA6CBCT0G TACTOCTGOG CTOGGTCTCC 'ItgAWrCfC OCACCTOCT C GGCATCCTCC 120

TTCTCCTOCT CGGOGCOGTT CCTOGCTTOC GC0GT6TCCG OOCAGCOCCC GCTGCOGGAC 180

CAGTGCCCCG CGCTGTGCGA GTGCTCCX3AG GCA6C606CA CAGTCAAGTG CGTTAAC06C 240

AATCTGACCG AGGTGCCCAC GGACCTGCCC GCCTACGTGC GCAACCTCTT CXTTACOGGC 300

AACCACCTGG CCAGCAACCA CTTCCTTTAC CTGC06066G ATGTGCT6GC 0CAACT6CCC 360

AGCXrrCAGGC AOCTGGACTT AAGTAATAAT TOGCTGOTGA G0CT6A0CTA GGTGTCCTTC 420

0SCAA0CT6A CACATCTAGA AAGCCTCCAC CTGGAGGACA ATGCCCTCAA GGTCCTTCAC 480

AATGGCACCC TGGCTGAOTT GCAAGGTCTA CCXXSVCATTA GGGTTTTCCT GGACAACAAT 540

CCCTOGGTCT GCGACTGCCA CATGGCAGAC ATGGTGACCT GGCTCAAGGA AACAGAGGTA 600

GTGCAGGGCA AAGACCGGCT CACCTGTG^ TATCGGGAAA AAATGAGGAA TOGGGTCCTC 660

TTGGAACTCA ACAGTGCTGA OCTGGACTGT GACCGGATTC TTCCCCCATC OCTOCAAACC 720

TCTTATGTCT TCCTGGOTAT TGTTTTAOOC CTGATAGGOS CTATTTTCCT OCrQGTTTTG 780

TATTTGAACC GCAAGGGGAT AAAAAAGTGG ATGCATAACA TCAGAGATGC CTGCAGGGAT 840

CACATGGAAG GGTATCATTA CAGATATGAA ATCAATGCG6 ACCGCAfiATT AACAAACCTC 900
A6TTCTAACT 06GATGTCCT OSAGTGA



Seq ID NO: 407 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

I I I I I I 

MPGGCSRGPA AGDGRLRLAS LALVLLGWVS SSSPTSSASS FSSSAPFIAS AVSAQPPLPD 60 

QCPALCECSB AARTVKCVNR NLTEVPTDIiP AYVRNLFLTG NQLASNHFLY LPRDVLAQLP 120 

SLRHLDLSKN SLVSLTYVSP RMLTBLESLH LEDZVUJCVia NGTLAELQGL FRIRVFLDNN 180 

PWVCDCKMAD MVTHLKETEV VQGRDRI.TGA YPSKMRNRVL LELKSADLOC DPXIiPPSLQT 240 

SYVFLGIVLA LIGAXFLLVL YUIBKGXXXIf MHKIRDACRD HKEGyHYRYB ZHADPRLTNL 300 

SSNSZ3VLE 



Seq ID NO: 408 DNA sequence

Nucleic Acid Accession #: NM_000095.1

Coding sequence: 26..2299



1 11 21 31 41 51 

I I I I I I 

CAC5CACCCAG CTCCC06CCA COOCCATGGT CX:CCX3AC3VCC GCCTGOSTTC TTCTGCTCAC 60 

CCTGGCTGCC CTCGGOGCGT CCGGACAGGG CCAGAGCCCG TTGGGCTCAG ACCTGGGCCC 120 

GCA6ATGCTT COGGAACTGC AGGAAACCAA CGOGGOGCTG CAGGA06TGC GG6ACTGGCT 180 

G06GCAGCAG GTCAGGGAGA TCA06TTCCT GAAAAACACO aTaATGGAOT GT6AC6067G 240 

06GGATGCA6 CAGTCAGTAC GCAC06GCCT ACCCA608TO OGGOCCCTQC T0CACTGC6C 300 

GCCOGGCTTC TGCTTCCCOG GOTrGGCCTG CATCCAGAOG GAGAGCGGOG GCCGCTGOGG 360 

CCCCTGCCCC GOGGGCTTCA CGGGCAACQG CTOGCACTGC ACCGACGTCA ACGAGTGCAA 420 

CGCCCACCCC TQCTTCCCCC GA6TC0GCTG TATCAACACC AGCCCGGGGT TCCGCTQCGA 480 

G6CTTGCCCG CGGGGGTACA GOGGCCOCAC CCACCAOGGC GTGGGGCTOa CTTTOQCCAA 540 

GGCCAACAAG CAGGTTT6CA COGACATCAA OQAQTOTQAO ACC6GGCAAC ATAACTGGGT 600 

CCCCAACTCC GTGTGCATCA ACACCCGGGG CTCCTTCCAO TGOGGCXXGT GCCAGCCCGG 660 

CTTCGTGGOC GACCAQQCGT CCGGCTGCCA GCGCGGOGCA CAGCGCTTCT GCCCOGACQG 720 

CT06CCCA6C GAGTGCCACG AGCATGCAGA CTGCGTCCTA GAGCGCGATG GCTOGCGGTC 780 

GTGCGTGTGT OGGGTTQGCT GGGCCG6CAA OGGGATCCTC TGTGGTGGC9 ACACT6ACCT 840

AGAOGGCTTC COGGAOGAGA AGCTGC6CTG CCCGGAGCOO CA6TG0CGTA AOGACMCTG 900

GGTGACTGTG CCCAACTCAG GGGAGGAGGA TGT6GACO0C GATGGCATCG GAGAC6CCTG 960 

CGATCCGGAT GCOGAOGGGQ ACGGGGTCCC CAATGAAAAG GACAACTGCC CGCTGGTGOG 1020 

QAACCCAGAC CAGCGCAACA GGGAGGAGGA CAAGTGGGGC GAT6C6TG0G ACAACTGC06 1080 

GTCCCAGAAG AAGQACGACC AAAAGGACAC AGAOCAOGAC GGCCG6GG09 ATG00TGC8A 1140 

OGAOGACATC GAOGGOGACC GGATCOSCAA CCAG6C0GAC AACTGCOCTA G6GTAC0CAA 1200 

CTCAGACCAG AAGGACAGTG ATGGCOATGG TATAGGG6AT OCCTGtGACA ACTGTCCCCA 1260 

GAAGAGC31AC CCGGATCAGG CGGATGTGGA CCAOGACTTT GTGGGAGATG CTTQTGACAG 1320 

OSATCAAGAC CAGGATGGAG ACGGACATCA GGACTCTOSG GACAACTGTC CCA0GG7GCC 1380 

TAACAGTOCC CAGGAOGACT CAGACCACGA T66CCAGGGT GAT6CCIG08 AGGA06ACQA 1440 

G8ACAAT6AC 66AGTCCCT6 ACA6TCGGGA CAACTQCOQC CTG Q TQOCTA ACCCOGGCCA 1500 

GGAGGACGQG GACAGGGAC6 G06TGGGCGA 0QTGT6CCAO GAOGACTTTQ ATGCAGACAA 1560 

GGTGGTAGAC AAGATCGACG T6TGTC0GGA GAACGCTGAA GTCACGCTCA COGACTTCAG 1620 

GGCCTTCCAG ACAGT0GT6C TGGACCC6GA G66TGA0606 CAGATTGACC CCAACTGGGT 1680 

GGTGCTCAAC CAGGGAAGOQ AGATOSTOCA GACAATGAAC AGGGAOCCAO GCCTGGCTQT 1740 

6GGTTACACT 6CCTTCAATG 60GTG6ACTT OGAGG6CA0G TTCCATGTGA ACACOGTCAC 1800

GGATCAC6AC TATG0GG6CT TCATCTTTGG CTACCAOGAC AGCTCCAGCT TCTAOGTGGT 1860

CATGTGGAAG CAGATGGAGC AAACGTATTG OCAGGOQAAC CCCTTCOGTG CTGTGGCOGA 1920 

GCCTGGCATC CAACTCAAG6 CTGTGAAGTC TTCCACAGOC CC06G6GAAC AGCTGCG6AA 1980

03CTCTGTG6 CATACAGQAG ACACA6AGTC CCAGOTGCGG CTGCT6TG G A AGGACCOGCQ 2040 

AAAOGTGGGT TGGAA6GACA AQAAGTCCTA TCGTT06TTC CTGCA6CACC 6GCCCCAAGT 2100 

GGGCTACATC AGGGTGCGAT TCTATGAGGG CCCTGAGCTG GTGGCOGACA QCAACGTGGT 2160 

CTTGGACACA ACCATGOGGG GTGGCOGCCT GGGQGTCTTC TGCTTCTCCC AGGAGAACAT 2220 

CATCTGGGCC AACCTGGGTT ACOGCTGCAA T6ACACCATC CXAGAGGACT ATGAGACCCA 2280 

TCAOCTGOOQ GAA6CCTAGG OACCAGGGTG AG6A0CG80C G6ATQACAGC CACCCTCAOC 2340 

GGQ6CTGQAT GGGGGCTCTG CACCCAGCCC AAGG6GTGGC OGTCCTGAGG G6GAAGTGAG 2400 
AAGGGCTCAG AGAGGACAAA ATAAAGTGT6 TOTGCAaGG 



Seq ID NO: 409 Protein sequence
Protein Accession #: NP_000086.1

1 11 21 31 41 51

I I I I I I

MVPDTACVX«L LTLAALGASG QGQSPIiGSDL GPQMLRELQE TNAALQZTVRD HLRQQVREIT 60

FLKNTVMECD A0C3WSVRT OLPSVBShUB CAPGFCFVGV ACIQTESGGR OGPCPAGFTG 120

NGSHCXSVNE CKAHFCFPRV RCINTSPGFR CEACPPGYSG PTEQGVGLAF AKANKQVCTD 180

IMECETGGHN CVPNSVCINT S0SFQCS90Q PGFVGDQASG OQRGAQRFCP [illegible] 240

ADCVI£RDGS RSCVCRVGHA GMGILaSHOT DLDGPFDBKL RCPBPQCBXD BfCVTVPNSGQ 300

BSVDRDGZCD ACDPDADGDG VFKEKZSNCPXi VIQIFDQRNTD [illegible] CRSQKNDDQK 360

DTOODGROIA CDDDIDGDRI SNQADNCFRV FNSDQKDSDG D6IGDACDNC PQKSNPDQAD 420

VDKDFVGDAC DSDQZXPGZX3 KQDSRDNCFT VFHSAQEDSD HDOQGDACDD ODDNDGVFDS 480

SSHCRLVFKP OQEDADRDGV GDVCQDDFDA DKWDKXOVC PENAEVTL-ro FRAFQTWLD 540

PEGDAQXDPI} WWZjNQGREI VQTKSSDVGL AVGYTAFNGV DFEGTFHVN7 VTDDDyAGFZ 600

FGYQDSSSFY WMNXQMEQT YWQANPFRAV ABPGXQLKAV KSSTGP6BQL RHALHHTGDT 660

ESQVRLLWKD PRNVGHKDKK SYRHPL^RP QVCYIRVRFY BGPSLVADfiN WLDTTMRGG 720
RLCVFCFSQB NZIKANLRYR CNDTZPEDVE 1BQLRQA

Seq ID NO: 410 DNA sequence

Nucleic Acid Accession #: NM_001565.1

Coding sequence: 67..363

1 11 21 31 41 51

I I I I I I

GAGACATTCC TCAATTGCTT AGACATATTC TGAGCCTACA GCAGAGGAAC CTCCAGTCTC 60

AGCACCATQA ATCAAACTGC GATTCTGATT TGCT6KTTA TCTTTCTGAC TCtAAGTGGC 120

ATTCAAGGAO TACCTCTCTC TAGAACGGTA OGCTGTACCr GCATCAGCAT TAGTA ATCAA 180

CCTQTTAATC CAAGGTCTTT AGAAAAACTT GAAATTATTC CTGCAAGCCA ATTTTGTCCA 240

CGTC5TTGAGA TCATTGCTAC AATGAAAAAG AAGGGTGAQA AGAGATGTCT GAATCCAGAA 300

TOGAAGGCCA TCAAC5AATTT ACTGAAA6C3V GTTAGCAAGO AAATGTCTAA AAGATCTCCT 360

TAAAACCAGA GG6GAGCAAA ATGGATOCAG TGCTTCCAftO GATGGACCAC ACAGAOGCTG 420

CCTCTCCCAT CACTTCCCTA CAT6GAGTAT ATGTCAAGCC ATAATTGTTC TTAGTTT6CA 480

GTTACACTAA AAGGT6ACCA ATGATGGTCA CCAAATCA5C TGCTACTACT CCTGTAGGAA 540

GGTTAATGTT CATCATCXTA AGCTATTCAG TAATAACTCT ACCCTGGCAC TATAATGTAA 600

GCTCTACTGA G6TGCTAT6T TCTTAGTGGA TGTTCTGACC CT6CTTCAAA TATTTOCCTC 660

ACCTTTCCCA TCTTCCAAG6 GTACTAAGGA A TCTTTCT G C TTTGGGGTTT ATCAGAATTC 720

TCASAATCTC AAATAACTAA AAGGTATGCA ATCAAATCT6 CTTTTTAAAB AATGCTCTTT 780

ACTTCAT6GA CTTCCACTGC CATCCTCCCA AGGGGCXXAA ATTCTTTCAG TGGCTACCTA 840

CATACAATTC CAAACACATA CAGGAAGGTA GAAATATCTG AAAATGTATG TGTAAGTATT 900

CTTATTTAAT GAAAGACT6T ACAAAGTATA AGTCTTAGAT GTATATATTT CCTATATTGT 960

TTTCAGTGrTA CATGGAATAA CAXGTAATTA AGTACTATGT ATGAATGAGT AACAGGAAAA 1020

TTTTAAAAAT ACAGATAGAT ATATGCTCTG CATGTTACAT AAGATAAATG T0CTGAATG6 1080
TTTTCAAATA AAAATGAGOT ACTCTOCTGG AAATAITAAG

Seq ID NO: 411 Protein sequence
Protein Accession #: NP_001556.1

1 11 21 31 41 51

I I I I I I

MNQTAILZCC LZFLTLSGIQ GVPl^SRTVRC TCISISNQFV NPRSLEKLEI ZPASQFCPRV 60
EZZAIMKKRS EKROdlPBSK AZKKUiKAVS KEHSKRSP

Seq ID NO: 412 DNA sequence
Nucleic Acid Accession #: XM_057014
Coding sequence: 143..874

1 11 21 31 41 51

I I I I I I

QGGAGQGAGA GAGGOGOGOG GGTGAAAGGC GCATTGATGC agcctgcggc GGCCTCGGAG 60

CGOGGCGGAG CCAGAOGCTG ACCACX5TTCC TCTCCT0G6T ctcctccgcc TCCAGCTGOQ 120

GGCTGCCOGG CAGCCG66AG CCATGGGACC CCAGGGCOCC GCOGCCTCCC G6CA6066CT 180

CG60GGCCTC CTGCTGCTCC TGCT6CTGCA 6CTGCCCG0G CXX3T0GAGCG CCrClGAGAT 240

CCOCAAQGGG AAOCAAAAGG OGCAGCTCOG GCAGAGGQAG GTGGTGGACC TGTATAATGG 300

AATGTGCTTA CAAGQQCCAG CAGGAGTXWC TGOTCGAGAC GGGM3CCC7G GGGCCAATGG 360

CATTCCGGGT ACACCTGGGA TCCC AGgr OG GGATGGATTC AAAGGAGAAA AGGGGGAATG 420

TCTGAGGGAA AGCTTT6AGG AGTCCTGGAC ACCCAACTAC AAGCAGTGTT CATG6A6TTC 480

ATT6AATTAT GGCATAGATC TTGGGAAAAT TGOGGAGTGT ACATTTACAA AGATGCGTTC 540

AAATAGTGCT CTAAGAGTTT TGTTCAGTGG CTCACTTOGG CTAAAATGCA GAAATGCATG 600

CT6TCASCGT TGGTAT7TCA CATTCAATGG AGCTGAATGT TCAGGACCTC TTCCCATTGA 660

AGCTATAATT TATTTGGACC AAGGAA6CCC TGAAATGAAT TCAACAATTA ATATTCATCG 720

CACTTCTTCT GTG6AAG6AC TTTGTGAAGG AATTGGTGCT G6ATTAGTGG ATGTTGCTAT 780

CTOGGTTG6C ACTTGTTCAG ATTACCCAAA A0QA6ATGCT TCTACTGGAT G6AATTCAGT 840

TTCT06CATC ATTATTGAAG AACTACXAAA ATAAATGCTT TAATTTTCAT TTGCTACCTC 900

TTTTTTTATT ATGCCrrCGA ATGGTTCACT TAAATGACAT TTTAAATAAG TTTATGTAXA 960

CATCTGAATQ AAAAGCAAAG CTAAATATGT TTACAGACXZA AAGTGTGATT TCACACTGTT 1020

TTTAAATCTA GCATTATTCA TTTTGCTTCA ATCAAAAGTG GTTTCAATAT TTTTTTTAGT 1080

T6GTTAGAAT ACTTTCTTCA TAGTCACATT CTCTCAAOCr ATAATTTGGA ATATTGTTGT 1140

[illegible] TTTTTCTCTT AGTATAGCAT T7TTAAAAAA AXATAAAASC TACCAATCXT 1200

TGTACAATTT GTAAATGTTA AGAATTTTTT TTATATCTGT TAAATAAAAA TTATTTCCAA 1260
CAACCTTAAA AAAAAAAAAA AAAA

Seq ID NO: 413 Protein sequence
Protein Accession #: XP_057014

1 11 21 31 41 51

I I I I I I

KRPQGPAASP QRLR6LLLLL LLQZiPAPSSA SEIPKGKQKA QLBQREVVDL YN04CLQ6PA 60

GVPGRDGSPG ANGIPGTP6X PQRD6FXGSK GSOAESFEB SHTPNyKQCS HSSIilYGZDL 120

GKZAECTFTK MRSNSALRVL FSGSLRLKCB KACOQRWm FBGABCS6PL PZEAZIYZiDQ 180

GSPEaWSTZM ZHRTSSVB6L CBOZGAOLVD VAIHVGTCSD YPKGDASTGH NSVSRIZXBB 240
LPK

Seq ID NO: 414 DNA sequence

Nucleic Acid Accession #: XM_084007

Coding sequence: 138..2405

1 11 21 31 41 51

I I I I I I

CTOSTGCOGA ATTCGGCAOG AGACCGCGTG TTOGCGCCTQ GTAGAGATTT CTCGAAGACA 60 

OCSWJTOGGCC OGTGTQGAAC CAAACCTGCG OOOGTGGCCG GGCOGTOGGA CAAC3GAGG0C 120 

GC6GAGA0GA AGGCX3CAATG GOGAGGAAGT TATCTGTAAT CTTGATCCTG ACCTTTSCCC 180 

TCTCXGTCAC AAATCCCCTT CATGAACTAA AAGCAGCTGC TTTCCOCCAG ACCACTGAGA 240 

AAATTAGTCC GAATTGGGAA TCTGGCATTA ATGTTGACTT GGCAATTTCC ACA0S6CAAT 300 

ATCATCTACA AGAGCTTTTC TACOSCTATG QAGAAAATAA VIXJAT T GTC A GTTGAAeGGT 360 

TCAGAAAATT ACTTCAAAAT ATAG6CATAG ATAAGATTAA AAGAATCCAT ATACAOCATG 420 

ACCAOGACCA TCACTCAGAC C31CXSACCATC ACTCAGACCA TGAGC3GTCAC TCAGACCATG 480 

AGCATCACTC AGACCAOGAG CATCACTCTG ACCATGATCA TCACTCCCAC CATAATCATG 540 

CrOCTTCSGG TAAAAATAAG OGAAAAGCTC TTT60CCAGA OCATGACTCA GATAGTTCAG 600 

GTAAAGATCC TAGAAACAGC CAGGGGAAAG GAGCTCACOQ ACCA6AACAT GCCAGTGGTA 660 

GAA6GAAT6T CAAGGACAGT GTTAGT6CTA GT6AAGTGAC CTCAACTGTG TACAACACTG 720 

TCTCTGAAGG AACTCACTTT CTAGAQACAA TAGAGACTOC AAGAOCTGGA AAACTCTTCC 780 

GCAAAGATGT AA6CA6CTCC ACTCCACCCA GTGTCACATC AAAGAGCOGG GTGAGCCGGC 840 

TGOCTOOTAg 6AAAACAAAT GAATCTGTGA GT6AGCCC03 AAAAGGCTTT AT6TATT0CA 900 

QAAACkCAAA TGAAAAtCCT CftOGA G T G TT TCAAT6CATC AAAGCTACIG ACATCTCATG 960 

GCATGGGCAT CCAGG T TCaS CTGAATGCAA CAGAGTTCAA CTATCrCTGT CCAOCCATCA 1020 

TCAACCAAAT TGATGCTAGA TCTTGTCTGA TTCATACAAG T6AAAAGAAG GCTGAAATCC 1080 

CrCCAAAGAC CIATTCATTA CAAATAGCCT GGQTTGGTGG TTTTATAGCC ATTTCCATCA 1140 

TCAGTTTCCT OTCTCTGCTO GGGOTTATCT TAGTGCCTCT CMGAATOGG GTGT r i TTCA 1200 

AATXTCTCCT aAGTTTOCTT GTOGCACTGG COGTTGGGAC TTTGRGTGQT GATGCTTTTT 1260 

TACACCTTCT- TCCAC3VTTCT CATGCAAGTC ACCACCATAG TCATAGCCAT GAAGAAOCAG 1320 

CAATGGAAAT GAAAAGAGGA CCACTTTTCA GTCATCTGTC TTCTCAAAAC ATAGAAGAAA 1380 

GTGCCTATTT TGATTCCAOG TGGAAGGGTC TAACA6CTCT A6QAGGCCTG TATTTCATGT 1440 

TTCTTGTTGA ACATGTCCTC ACATTGATCA AACAATTTAA AGATAAGAAG AAAAAGAATC 1500 

AGAA6AAACC T6AAAATGAT GATGATGTGO AQATTAA6AA GCAGTTGTCC AA6TATGAAT 1560 

CTCAACTTTC AACAAATGAG GAGAAAGTAG ATACAGATGA TOGAACTGAA GGCTATTTAC 1620 

GAGCAGACTC ACAAGAGCCC TCCCACTTTG ATTCTCAGCA GCCTOCAOTC TTGQAAGAAG 1680 

AAGAOGTCAT GATAGCTCAT GCTCATCCAC AGGAA6TCTA CAAT6AATAT GTACOCAGAG 1740 

G8TGCM6AA TAAATGCCAT TCACATTTCC AOQATACACT 08GCCAGTCA 6A0GATCTCA 1800 

TTCACCACCA 7CATGACTAC CATCATATTC TGCATCATCA CCACCACCAA AACCAOCATC 1860 

CTCACRGTCA CAGCCAGCGC TACTCTCXMG AGGAGCT6AA AGATGCOGGC GT(»CCACTT 1920 

TGGCCTGGAT GCTGATAATG GGTGATGGCC TGCACAATTT CAGOGATGGC CTAGCAATTG 1980 

G m c m LTiT TACTGAAGGC TTATCAAGTG GTTTAAGTAC TTCTGTTGCT GTGTT CTGTC 2040 

ATGAGTT6CC TCATGAATTA GGTGACTTTO CTOTT C TACT AAAOGCTGGC ATGACOGTTA 2100 

AGCAGGCTGT GCTTTATAAT GCATTGTCAO 0CAT6CTG6C GTATCTTGGA AT6GCAACAG 2160 

GAATTTTCAT TGGTCATTAT GCTGAAAATG TTTCTATGTG GATATTTGCA CTTACTGCTG 2220 

GCTTATTCAT GTATGTTGCT CTGGTTGATA TGCTACXTTGA AATGCTGCAC AAT GATGCT A 2280 

GTGACCATGG ATGTAGCOGC TGGGGGTATT TCTTTTTACA 6AATGCTG6G ATGCTTTTGG 2340 

GTTTTG6AAT TATGTTACTT ATTTCCATAT TTGAACAXAA AAT06TGTTT OGTATAAATT 2400 

TCTA6TTAAG GTTTAAATGC TAGAGTAOCT TAAAAAGTTG TCATAGTTTC AGTAGGTCAT 2460 

AGGQAGATGA GTTTGTATGC TGTACTATGC AGCGTTTAAA 6TTAGTX3G6T TTTGTGATTT 2520 

TTGTATTGAA T A TTGCTGTC TGTTACAAAG TCAGTTAAAG 6TACXSTTTTA ATATTTAAGT 2580 

TATTCTATCT TGGAGATAAA ATCTGTAT6T GCAATTCACC GGTATTACCA GTTTATTATG 2640 

TAAACAA6A8 ATTTG6CATG ACATGTTCIG TATGTTTCAa GGAAAAATGT CTTTAATGCT 2700 

TTTTCAAGAA CTAACACAGT TATTCCTATA CT66ATTTTA GeTCTCTGAA GAACTGCTGG 2760 

TGTTTAGGAA TAAGAATGTO CATGAAGCCT AAAATACCAA GAAAGCTTAT ACTGAATTTA 2820 

AGCAAAGAAA TAAAGGAGAA AAGAGAAGAA TCTGAGAATT GGGGAGGCAT AGATTCTTAT 2880 

AAAAATCACA AAATTTGTTG TAAATTAGAG GGGAGAAATT TAGAATTAAG TATAAAAAGG 2940 

CAGAATTAGT ATAGAGTACA TTCATTAAAC ATTTTTOTCA GGATTATTTC COGTAAAAAC 3000 

GTAGTGAGCA CrCTCATATA CTAATTAGTG TACATTTAAC TTTQTATAAT ACAGAAATCT 3060 

AAATATATTT AATGAATTCA AGCAATATAC ACTTGACCAA GAAATTGGAA TTTCAAAATG 3120 

TTCGTGOGGQ TTATATACCA GATGAGTACA GTGAGTAGTT TATGTATCAC CAGACTGGGT 3180 

TATTGCCAAO TTATATATCA CCAAAAGCTG TATGACTGGA 'ltfn'Clx: ;G TT ACCTGGTTTA 3240 

CAAAATTATC AGAGTAGTAA AACTTTGATA TATATGAGGA TATTAAAACT ACACTAAGTA 3300 

TCATTTGATT OGATTCAGAA AGTACTTTGA TATCTCTCAG TGCTTCAGTG CTATCATTGT 3360 

GA6CAATTGT CTTTATATAC GGTACTGTAO CCATACTAGG OC W XCTOT G GCATrCTCTA 3420 
GATOTTTCTT TTTTACACAA TAAATTCCTT ATATCA6CTT G 

Seq ID NO: 415 Protein sequence
Protein Accession #: XP_084007

1 11 21 31 41 51

I I I I I I

MARKLSVILI LT7ALSVTNP LHELKAAAFP QTTEKISPKW BSGINVDLAI STRQYHLQQL 60 

FYRYGENNSL 8VBGFRKLLQ NIGIDKIKRI BXHHDHDHHS DHEKHSDHER BSDHSHHSDH 120 

EKHSDHDRBS HBNHAASGKN KRKAliCPDED SDSSGKDPRN SQC^OGAHRPE HASGRRNVKD 180 

SVSASEVTST VYUTVSBGTH FLETIBTPRP GKLPPKDVSS STPPSVTSKS BVSRLAGRKT 240 

KESVSEPRKG PMYSRHTSEN PQECPNASKL LTSHOIGIQV PLNATEFNYL CPAIIKQIDA 300 

RSCLIHTSEK KAEIPPKTYS LQIAWVGGFI AISIISPLSL LGVILVPUflN RVFFKFIiLSP 360 

I«VALAVGTLS GDAFIiHLI*PH SHASHBBSHS HEEPAMENKR GPLFSHLSSQ NIEESAYFDS 420 

TIfKGLTALGG XiYFMFLVEKV LTLXKQFKDK KKKNQKKPEN DDDVEIKKQL SKYESOLSTN 480 

EEKVDTDDRT BGYZAADSQB PSHFDSQQPA VLEEEEVMIA HAHPQEVnCHE YVPRGCKHKC 540 

HSHFHDTL6Q SDDLIHHBHD YHHILBHEHH QKHHPSSHSQ RYSR5ELKDA GVATLANMVI 600 

MQD6LHNFSD GLAIGAAFTE GLSSGLST5V AVFCHELPBB LGDFAVLLKA 04TVKQAVLY 660 

KALSAMLAVIi Q4ATGZFZ6H YAEtJVSMHIF ALTAGIiFMYV ALVI3KVPB4L HNDASDHGCS 720 
RHGypFLQKA GMZiLGFGIHL LISZFEBKIV FRINP 

Seq ID NO: 416 DNA sequence

Nucleic Acid Accession #: NM_015419.1

Coding sequence: 1..8487

1 11 21 31 41 51

I I I I I I

ATGCCCAAGC GCX5CGCACTG GGGGGCCCTC TCCGTGGTGC TGATCCTGCT TTGGGGCCAT 60 

OCGOGAGTGG OGCTGGCCTO CCCGCATCCT TGTGCCTGCT AOSTCCCCAG OGAGGTCCAC 120

TGCACGTTCC GATCCCTGGC TTCOGTGCCC GCTGGCATTQ CTAOACAOGT OQAAAGAAIC 180 

AATTTGGGGT TTAATAGCAT ACA6600CT6 TCAGAAAOCT CATTTGCAGG ACTGACCAAG 240 

TTGGAGCTAC TTATGATTCA CGGCAATGAG ATCCCRAGCS^ TCCCOGATQG AfiCTTTAWa 300 

GACCTCAGCT CTCTTCAGGT TTTCAAGTTC AGCTAC»AC3^ AGCTGAGAGT GATCACAGGA 360 

CAGAOCCTCC AGGGTCTCTC TAACTTAATG AGGCTGCACA TTGACCACAA CAAGATOGAG 420

TTTATCCACC CTCAAGCTTT CAAOGGCTTA ACGTCTCTGA GGCT ACTQCA TTTGG AAGGA 480

AATCTCCTCC ACCAGCTGCA CCCCAGCACC TTCTCGAOST TCACATTTTT GGATTATTTC 540 

AOACTCTCCA CCATAAGGCA OCTCTACTTA GCAGAGAACA TGGTTAGAAC TCTTCCTGCC 600 

AOCATGCTTC GGAACATGCC CCTTCTGGAG AATCTTTACT TGCAGGGAAA TCCX5TGGACC 660 

TGCGATTGTG AGAT6AGATG GTTTTTGGAA TGGGATGCAA AATCCAGAGG AATTCTGAAG 720

TGTAAAAAGG ACAAAGCTTA TGAAGGOGGT CAfiTTGTGTG CAATGTGCTT CAGTCCAAAG 780 

AAGTTGTACA AACAT6AGAT ACACAAOCTO AAGGACATGA CTTGTCTGAA GCCTTCAATA 840 

GAGTOCCCTC TGAGACAGAA CAGGA6CAGG AGTATTGAGG AGGAGCAAGA ACAGGAAGAG 900 

OATGGTGOCA GCCA6CTCAT CCTGGAfiAAA TTCCAACTGC CCCAGTGGAG CATCTCTTTG 960 

AATATGAC06 ACGAGCAOGG GAACATGGTG AACTTGOTCT GTGACATCAA GAAACCAAT6 1020

GATGTGTACA AGATTCACTT GAACCAAAOG GATCCTCCAG ATATTGACAT AAATGCAACA 1080 

6TTGCCTTGG ACTTTGAGTG TCCAATGACC OGAGAAAACT ATGAAAAGCT ATGGAAATTQ 1140 

ATA6CATACT ACAGTGAAGT TCC0GT6AAG CTACACAGAG AGCTC ATOC T CAGCAAAGAC 1200 

CCCAGAGTCA GCTACCAGTA CAGGCA6GAT 6CTGATQAG6 AAGCTCrTTA CTACACAGGT 1260 

6TGAGA6CCC A6ATTCTTGC AGAACCAGAA TGQGTCATGC AGCCATCCAT AGATATCCAG 1320

CTGAACC6AC GTCAGAGTAC GGCCAAGAAG GTGCTACTTT CXTTACTACAC CCAGTATTCT 1380 

CAAACAATAT CCACCAAAGA TACAAGGCA6 GCTCGGGGCA GAAGCTGGGT AAT GATT GAG 1440 

CCTAGTGGAG CTGTGCAAAG AGATCAGACt GTCCTGGAAG 6GG6TCCATG CCAGTTGAGC 1500 

TGC3U10GTGA AA6CTTCTGA 6AGTCCATCT ATCTTCT6G6 TGCTTCCAQA TGGCTCCATC 1560 

CT6AAAG0GC CCATGGATGA CCCAGACAGC AAGTTCTCCA TTCTCAGCAG TGGCTGGCTG 1620

AGGATCAAGT CCATGGAGCC ATCTGACTCA GGCTTGTACC AGTGCATTGC TCAAGTGAGG 1680 

6ATGAAATGG ACOGCATGGT ATATAGGGTA CTTGTGCAGT CTCCCTCCAC TC AGCC AGCC 1740 

GAGAAAGACA CAGTGACAAT TG6CAAGAAC CCAGGGGAGT 06GTGACA7T GCCTTGCAAT 1800 

GCTTTAGCAA TACCOQAAGC OCACCTTAGC TGGATTCTTC CAAACAGAAG GATAATTAAT 1860 

GATTTGGCTA ACACATCACA TGTATACATG TTGCCAAATG GAACTCrTTC CATCCCAAAG 1920

GTCCAAGTCA GTGATAGTGG TTACTACAGA TGTGTGGCTQ TCAACCAGCA AGGGGCAGAC 1980 

CATTTTACGG TGGGAATCAC ACTGACCAAG AAAGGGTCTG GCITGCCATC CAAAAGAGGC 2040 

AGA080CCAG GTGCAAAGOC TCTTTCCAQA GTCAQA6AAG ACAT0GTG6A G6KIGAAGG6 2100 

GGCrCGGGCA TGGGAGATGA A6A6AACACT TCAAGGAGAC TTCTGCATCC AAAOGACCAA 2160 

GAGGTOTTCC TCAAAACAAA GGAT6ATGCC ATCAATGGAG ACAAGAAAGC CAAGAAAGGG 2220

AGAAGAAAGC TGAAACTCTG GAAGCATTOG GAAAAAGAAC CAGAGACCAA TGTTGCAGAA 2280 

6GTOGCAGAG T6TTTGAATC TAGACGAAGO ATAAACATGG GAAACAAACA GATTAATCOG 2340 

GAGOGCT GGG CTGATATTTT AGCCAAAOTC CSTGGGAAAA ATCTCCCTAA GGGGACAGAA 2400 

GTACCOCOGT TQATTAAAAC CACAAOTCCT CCATCCTTGA GCCTAGAAOT CACACCACCT 2460 

TTTCCTGCTG TTTCTCCCCC CTCAGCATCT CCTGTGCAGA CAGTAACCAG TGCTQAAGAA 2520

TCCTCAGCAQ ATGTACCTCT ACTTGGTGAA GAAGAGCAOG TTTTGGGTAC CATTTCCTCA 2580 

GCCA6GATG0 GGCTAGAACA CAACCACAAT GGAGTTATTC TTGTTGAACC TGAAGTAACA 2640 

AGCACACCTC TGGAGGAAGT TOTTGATQAC CTTTCTGAGA AGACTGAGGA GATAACTTCC 2700 

ACTGAAOGAG A0CTGAAG6G GACAGCAQCC CCTACACTTA TATCTGAGCC TTATGAACCA 2760 

TCTCCTACTC TGCACACATT AGACACAGTC TATGAAAAGC CCACCCATGA AGAGACGGCA 2820

ACAGAGGGTT GGTCTGCA6C AGATGTTGGA TCGTCACCAG AGCCCACATC CAGTGAGTAT 2880

GAGCCTCCAT TGGATGCTGT CTCCTTGGCT GACTCTGAGC CCATGCAATA CTTTGACCCA 2940 

aATTTOGAOA CTAAGTCACA ACCAGATGAG GATAAGATOA AAQAA6ACAC CTTTGCACAC 3000 

CTTACTCCAA COOOCACCAT CTGGGTTAAT GACTCCAGTA CATCACAGTT ATTTGAGGAT 3060 

TCTACTATAG GGQAACCAOO TGTCCCAGOC CAATCACATC TACAAGGACT GACAGACAAC 3120

ATCCACCTTG TGAAAAGTAG TCTAAGCACT CAAGACACCT TACTGATTAA AAAGOGTATO 3180 

AAAGAGATGT CTCAGACACT ACAGGGAGGA AATATGCTAG AGGGAGACCC CACACACTCC 3240 

AGAAGTTCTG A6AGTGAGGG CCAAGAGA6C AAATCCATCA CTTTGCCTGA CTCCACACTG 3300 

GGTATAATGA GCAGTATGTC TCCAGTTAAG AAGCCTOOGG AAACCACAGT TGGTACCCTC 3360 

CTAGACAAAG ACACCACAAC AGTAACAACA ACACCAAGGC AAAAAGTTGC TCCGTCATCC 3420

ACCATGAGCA CTCACCCTTC TCGAAGGAGA CCCAACGGGA GAAGGAGATT ACGCC CCAAC 3480 

AAATTCCGCC ACCQGCACAA GCAAACCCCA CCCACAACTT TTGCCCCATC AGAGACTTTT 3540 

TCTACTCAAC CAACTCAAGC ACCTGACATT AAGATTTGAA GTCAAOTQGA GAGTTCTCT6 3600 

GTTCCTACAG CTTGGGTGGA TAACACAGTT AATACCCCCA AACAGTTGGA AAT66A6AAG 3660 

AATGCAGAAC CCACATCCAA GGGAACACCA CQGAGAAAAC ACGGGAAGAG GCCAAACAAA 3720

CATCGATATA CCCCTTCTAC AGTGAGCTCA AGAGCX3TCCG GATCCAAGCC CAGC CCTTCT 3780 

CCAGAAAATA AACATAGAAA CATTGTTACT CCCAGTTCAG AAACTATACT TTT6CCTAGA 3840 

ACTGTTTCTC TGAAAACTGA GGGCCCTTAT GATTCCTTAG ATTACATGAC AACCACCAGA 3900 

AAAATATATT CATCTTACCC TAAAGTCCAA GAGACACTTC CAGTCACATA TAAACCCACA 3960 

TCA6ATG6AA AA6AAATTAA GGATGATGTT GCCACAAATG TTGACAAACA TAAA AGTGAC 4020

ATTTTAGTCA CTGGTGAATC AATTACTAAT GCCATACCAA CTTCTOXrrC CrTGGTCTOC 4080 

ACTATGG6AG AATTTAAGGA AGAATCCTCT CCTGTAGGCT TTCCAGGAAC TCCAA CCTGO 4140 

AATCCCTCAA GGAOGGCCCA GCCTGGGAGG CTACAGACAG ACATAOCTGT TACCR CTTC T 4200 

GGGGAAAATC TTACAGACCC TCCCCTTCTT AAAGAGCTTG AGQATOTOGA TTTCACTTOC 4260 

GAGTTTTTST CCTCTTTGAC AGTCTCCACA CSCATTTCAOC AGGAAGAAGC TGGTTCTTCC 4320

ACAACTCTCT CAAGCATAAA AGTGGAGGTO GCTTCAAGTC AGGCAGAAAC CACCACCCTT 4380 

GATCAAGATC ATCTTQAAAC CACTGTOGCT ATTCTCCTTT CTGAAACTAG ACCACAGAAT 4440 

CACACCOCTA CTGCTGCCCG 6ATGAAGGAG CCAGCATCCT CGTCCOCATC CACAATTCTC 4500 

ATGTCTTTGG GACAAACCAC CACCACTAAG CCAGCACTTC CCAGTCCAAG AATATCTCAA 4560 

GCATCXAjQAO ATTCCAAGGA AAATGTTTTC TTGAATTATG TGGG6AATCC AGAAACAGAA 4620

GCAAOCOCAG TCKACAATGA AGGAACACAG CATATGTCAQ GGCCAAAT6A ATTATCAACA 4680 

CCCTCTTCOG ACOGGGATGC ATTTAACTTO TCTACAAAGC TGGAATTGGA AAAGCAAGTA 4740 

TTTGGTAGTA GGAGTCTACC AOGTGGCCCA GATAGCCAAC GCCAGGATGQ AAGAGTTCAT 4800 

GCTTCTCATC AACTAACCAG AGTCCCTGOC AAACCCATOC TACCAACAGC AACAGT6AG0 4860

CIAOCXGAAA TGTOCACACA AAOOGCTTOC AGATACTTTG TAACTTCCCA GTCAOCTCGT 4920

CACTOGACCA ACAAACOGGA AATAACTACA TATCCTTCTG G6GCTTTGCC ACSIGAACAAA 4980 

CAQITIACAA CTOCAAGATT ATCAAGTAOl ACAATTCCTC TC0CATT6CA CATGTOCAAA 5040 
O0CA6CATTC CXAGTAAGTT TACTGACC3GA AGAACIGAGC AATTQUVTGG TIACTCaUUl 5100

GTGTTTG6AA ATAACAACJVT GCCTGAGGCA AGAAAOCXAG TTGGAAAGCC TCCCAGTCOV 5160 

AGAATTCCTC ATTATTCCAA T6GAAGACTC CCTTTCTTTA CCAACAAGAC TCTTTCTTTT 5220 

0CACAGTT6G GAGTCACCOG GAGACCCCAG ATACCCACTT CTCCTGCCCC AGTAATGAGA 5280 

GAGAGAAAAG TTATTCCAGG TTCCTACAAC AG6ATACATT CCCATAGCAC C7TCCATCT6 5340

GACTTTG6CC CTGOGGOUX: TC OGTWnU CACACTCCX3C AGAOCACX3G6 ATCACOCTCA 5400 

ACTAACTTAC AGAATATCCX: TATGGTCTCT TCCACOCAGA GTTCTATCTC CTTTATAACA 5460 

TCTTCrGTCC AGTCCTCRGG AAGCTTCCAC CAGAGCAGCT CAAAGTTCTT TGCAGGAGGA 5520 

CCTCCTGCAT CCAAATTCTG GTCTCTTGGG GAAAAGCCCC AAATCCTCAC CAAGTCCCCA 5580 

CAGACTG7GT COGTCACOGC TGAGACAGAC ACTGTGTTGC CCTGTGAGGC AACAGGAAAA 5640 

OCAAASCCTT T08TTACTTG GACftAAGGTT TCCACAG8AG CTCTTATGAC TCOSAATAGC 5700 

ASGATACAAC GGTTTGAG6T TCTCAAGAAC GGTACCrTAG T6ATA06QAA G6TTCAAGTA 5760 

CAAGATOGAG GCCAGTATAT GTGC3WXGCC AGCAACCTGC ACGGCCTGGA CAGGATGGTG 5820

GTCTTGCTTT OGGTCACCGT GCAGCAACXT CAAATCCTAG CCTCCCACTA CCAGGA03TC 5880 

ACTGTCTACC TGGGAGACAC CATTGCAAT6 GAGTGTCTOG CCAAAOGGAC CCCA600CCC 5940 

CAAATTTOCT GGATCTTOOC TGACAG6AGG GTOTOGCAAA CTGTOTGCOC OQTGGACSIGC 6000 

OGCATCACOC T6CA0GAAAA CCGGAOCCTT TOCATCAAGG AG60GTCCTT CTCAGAOUSA 6060 

GGOGTCTATA AGTGCX5TGGC CA6C3U^TGCA GOC3GGGGCGG ACAGCCTGGC CATCCGCCTG 6120 

CACXS T GGCGQ CACTGCCCCC CGTTATCCAC CAGGAGAAGC TX3GAGAACAT CTCGCTGCCC 6180 

CCGGGGCTCA GCATTCACAT TCACTGCACT GCCAAGGCTG CGCCCCTGCC CAGOGTGOGC 6240 

TGGGTGCTCG GGGACX3GTAC CCA6ATC0GC CCCTCGCAGT TCCTCCAOGG GAACTTGTrT 6300 

GTTTTCCCCA ACGGGACGCT CTACATCCGC AACCTCGCGC CCAAGGACAG CGGGOGCTAT 6360 

GAGTGCGTG6 COSCCAACXrr GGTAGGCTCC 6080GCAGGA OGGTGCAGCT GAACGTGCAG 6420 

OGTGCAGCAG CCAA060GOQ CATCAO0G6C ACCTCCC06C GGAGGAOGGA CGTCAGGTAC 6480 

OGAGGAACCC TCAAGCT6GA CTGCAG06CC TOGGGGGACC CCTGGCOGCG CATCCTCTGG 6540 

AGGCTGCCGT CCAAGAGGAT GATCGACGCG CTCTTCAGTT TTGATAGCAG AATCAAGGTG 6600 

TTTGCCAAT6 GGACCCTGGT GGTGAAATCA GTGAOGGACA AAGATGCOGG AGATTACCTG 6660 

TGOGTAGCTC GAAATAAGGT TGGTGAT6AC TAOGTGGTGC TCAAAGIOGA TGTGGTGA7G 6720 

AAACCXSGCXa AGATTGAACA CAAGGAGGAG AAOGAOCACA AAGTCTTCTA C3GGGGGTGAC 6780 

CTGAAAGTGG ACTGTGTOQC CACCGGGCTT CCX3UVTCCCG AGATCTCCTG GAGCCTCCCA 6840 

GAOGGGAGTC TGGTGAACTC CTTCATGCAG TCGGATGACA GCGGTGGACG CACCAAGCX3C 6900 

TATGTOGTCT TCAACAATGG GACACTCTAC TTTAAOGAAG TGGGGATGAG GGAQGAAGGA 6960 

GACTACACCT GCT T T G C T GA AAATCAGGTC 6GGAAGGA0G AGATGAGAGT CAGAGTCAAG 7020 

6TGGTGACA0 GGCCCGCCAC CATOCXSGAAC AAGACTTACT TGGCGGTTCA GGTGCOCTAT 7080 

GQAGACGTGG TCACTGTAGC CTGTGAGGCC AAAG6AGAAC CCATGCCCAA GGTGACTTGG 7140 

TTGTCCCCAA CCAACAAGGT GATCCXXACC TCCTCTGAGA AGTATCAGAT ATACCAAGAT 7200 

GGCACTCTCC TTATTCAGAA AGCCCAGCGT TCTGACAGCG GCAACTACAC CTGCCTG6TC 7260 

AGGAACAG03 GGGGAGAG6A TAGGAA6A06 6T6T6GATTC AOGTCAAOGT CCAOCCAOCC 7320 

AASATCAAOG GTAACOCCAA CCCCATCACC ACOOTGOGGO AGATAGGAGC 0G6QGGCAGT 7380 

CGGAAACTGA TTGACTGCAA AGCTGAAGGC ATCCCCACCC CGAGGGTGTT ATGGGCTTTT 7440 

CCOGAGGGTO TOaTTCTGCC AGCTCCATAC TATGGAAACC GGATCACTGT CXATGGOUVC 7500 

GGTTCCCTGG ACATCAGGAG TTTGAOGAAG AG0GACTCO6 TCX316CTGGT AT6CAT6GCA 7560 

CX5CAA0GAGG QAGG6GAGGC GA GG TTGATC GTQCAGCTCA CTOTCCTGGA OCCGATOaAO 7620 

AAACCCATCT TCCAC6ACCC GATCAGOGAG AAOATCAOGG CCATGGOGGG CCACACCATC 7680 

AGCCTC3U«rr GCTCTGOOGC GGGGACCCOG ACACCCAGCC TGGTGTGGGT CCTTCCCAAT 7740 

GGCACOSATC TGCAGAGTGG ACAQCAGCTG CAGCGCTTCT ACCACAAGGC TGACGGCATG 7800 

CTACACATTA GCGGTCTCTC CTCGGT6GAC GCIGGGGCCT ACOGCTGCXTT GGCCCGCAAT 7860 

GCCGCTGGCC ACA06GAGAG 6CTGGTCTCC CTSAAGGIGG GACTQAAGCC AQAAGCAAAC 7920 

AAGCAGTATC ATAACCTGGT CS^BOITCATC AATGGTOAGA GOCTGAAGCT CCCCTGCACC 7980 

CCTCCC6GGG CTGGGCAGGG A06TTTCTCC TGQAOGCTCC CCAATGGCAT GCATCTGGAG 8040 

GGCCCCCAAA CCCTGGQAOQ CGTTTCTCTT CTGGACAATG GCACCCTCAC GGTTOGTGAG 8100 

GCCTCGGTGT TTGAC3U3GOG TACCTATGTA TGCAGGATGG AGAOGGAGTA CGGC0CTTCX3 8160 

GTGACCAGCA TCCCCGTGAT TGTGATOGCC TATOCTCCCC G6ATCA0CA0 06AG00CA0C 8220 

C0G6TCATCT ACACC0G6CC 06GGAACACC 6TQAAACTGA ACTGCATGGC TATGGGQATT 8280

CCCAAAGCTG ACATCAOGTG GGAGTTACCG GATAAGTCGC ATCTGAAGGC AGGGGTTCAG 8340 

GCTCGTCTGT ATGGAAACAG ATTTCTTCAC CCOCAGGGAT CACTGACCAT CCAGC31TGCC 8400 

ACACAGAGAa ATGCOGOCTT CTACAAGTGC ATGGCAAAAA ACATTCTOGG CAGTGACTCC 8460 

AAAACAACTT ACATCCAOOT CTTCTGAAAT 6TGGATTCCA GAATGATTGC TTAGGAACTG 8520 

ACAACAAAGC (JGUG ' lTi ' G T A A6GGAA6CCA GGTT GG G G AA TAGGAGCTCT TAAAXAATGT 8580 

GTCACAGTQC ATGGTGGCCT CTGGTGGGTT TaUGTTGAO GTTQATCTTG ATCTACAATT 8640 

GTT6GGAAAA GGAAGCAATG CAGACACGAG AAOGAGGGCT CAOCCTTGCT GAGACACTTT 8700 

crr nxyi -GT r tacatcatgc cagoggcttc attcagggtg tctgtgctct gactgcaatt 8760 

' IVi ' Cri 'C m ' TOCAAATGCC ACT0SACT6C CTTCATAA6C GTOCATAGGA TATCT6AGGA 8820 

ACATTCATCA AAAATAAGOC ATAGACATGA ACAACACCTC ACTACCCCAT TGAA6AGQCA 8880

TCACCTAGTT AACCTGCTGC AGTTTTTACA TGATAGACTT TGTTCCAGAT TGACAAGTCA 8940 

TCTTTCAGTT ATTTCCTCTG TCACTTCAAA ACTCCAGCTT GCCCAATAAQ GATTTAGAAC 9000 

CAGAGTGACT OATATATATA TATATATTTT AATTCAGAGT TACATACATA CA GCTAC XaT 9060 

TTTATATGAA AAAAGAAAAA CATTTCTTCC TGGAACTCAC TTTTTATATA ATGTTTTATA 9120 

TATATATTTT TTCCTTTCAA ATCAGACX5AT GAGACTAGAA GGA6AAATAC TTTCTGTCTT 9180 

ATTAAAATTA ATAAATTATT GGTCTTTACA AGACTTGGAT ACATTACAGC AQACATGGAA 9240 

ATATAATTTT AAAAAATTTC TCTCCAACXT CCTTCAAATT CAGTCACCAC TGTTATATTA 9300 

CCTTCTCCAG GAACCCTGCA GTGGGGAAGG CTGGGATATT AGATTTCXnT 6TATGCAAAG 9360 

TTTTT6TTGA AAOCTGTGCT CASAGGAGGT GAGAGGA6A6 GAAGGAGAAA ACTGCATCAT 9420 

AACTTTACAO AATT6AATCT AGAGTCTTCX: C0GAAAA6CC CAGAAACTTC TC TGCAG TAT 9480 

CTG6CTTGTC CATCTGGTCT AAGGTGGCTG CTTCTTCCCC AGCCAT6AGT CAGTTTGT6C 9540 

CCATGAATAA TACACGACCT GTTATTTCCA TGACTGCTTT ACTGTATTTT TAAGOTOUVT 9600 
ATACTGTACA TTTGATAATA AAATAATATT CTCCCAAAAA AAAAA 

Seq ID NO: 417 Protein sequence 
Protein Accession #: NP_056234.1

1 11 21 31 41 51 

I I I I I I

MPKBAHHGAL SWLZLUIGH PRVALACFHP CACYVPSEVH CTFRSLASVP A6IARRVERI 60 

NLGFNSIQAL SBTSFAGLTR LBLLMIBGHE IPSIPDGALR DLSSLOVPKF SYNKLRVZTG 120 

QTLQGLSmM RLHIDHNKIE FIHPQAFNGL TSLRLLHLEG HUaQLHPST PSTPTFU3YP 180 

RLSTZRHIiYL AENNVRTLPA SKLRNMPLLB NLYLQGUPNT CDCQ1RWFLB HDAKSRGXLK 240 
CKKDKAyEGQ QLCAMCPSPR KLYRBEIBKL KDKTOiKPSX ESS1AQSR8R SIEEBQEQBB 300

DGGSQliILBK FQLPQWSISL KMTDracaiMV NLVCDXKKFM DVYKIEXdlQT DPPDZDZKAT 360 

VALDFBCPMT RaJYEKLWKL lAYYSEVPVK LHRBUILSKD PRVSYQYRQD ADEEALYYTG 420 

VRAQILAEPE WVMQPSIDIQ LNRRQSTAKK VUiSYYTQYS QTISTKDTRQ ARGRSWVMIB 480 

PSGUVQRDQT VLBGGPOQLS Q7VKASESP5 ZFWVLPOGSI LKAPMDDPDS KFSZIiSSGHL 540 

RZKSMEPSDS GLYQCZAOVR DSOSNVySV LVQSPSTQPA BKDTVTK3KH PGESVTLPQ7 600 

AIAZPEAHLS HILPNRRIZH DLAN7SEVYM LPHGTLSIPK VQVSDS6YYR CVAVNQQ6AD 660 

HFTVGITVTK KGSGLPSKRG HRPGAKALSR VREDIVEDBG GS04GDBSIT SRRLLHPKDQ 720 

EVPLKTKDDA INGDKKAiOOG RRKLKLWKHS EKEPETNVAE CSRRVFBSRRR IKMANKQINP 780 

ERWADILAKV RGKNLPKGTE VPPXiIKTTSP PSLSLEVTPP FPAVSPPSAS P VQTVT SABB 840 

SSADVPLLGE EEHVLGTISS ASMGLEKNHK GVZLVEPEVT STPLEEWDO LSEKTEEITS 900 

TEGDLKGTAA PTLISEPYEP SPTLHTLDTV YEKPTHEErA TBGWSAADVG SSPEPTSSBV 960 

EPPLDAVSLA BSEPMQYFDP DLETKSQPDB DKMKEDTFAH LTPTPTIHVN DSSTSQLPED 1020 

STIGBPGVPG QSHLQGLTDN IHLVKSSLST QPTUjIKKGM KEMSQTliQGG NMLEGDPTHS 1080 

RSSBSEX5QES KSITLPDSTL GIMSSMSPVK KPAETTVGTL LDKDTTTVTT TPRQKVAPSS 1140 

O^ISTHPSRRB PN6RRRLRPM KPRHREKQTP PTTFAPSETP STQPTQAPDX KZS§pVE5SL 1200 

VPTAHViarrv UTPKQLSffiK NAEPTSKGTP HRKHGKRPNK HRYTPSTVSS RASGSKPSPS 1260 

PENKHRNIVT PSSBTILLPR TVSLKTEGPY DSLDYMTTTR KIYSSYPKVQ ETLPVTYXPT 1320 

SDGKBIKDDV ATIJVDKHKSD ILVTGESITN AIPTSRSLVS 1KGBFKEESS PVGFPGTPTW 1380 

NPSRTAQPGR LQTDIPVTTS GQILTOPPLL KELEDVDPTS BFLSSLTVST PFHQEEAGSS 1440 

TTLSSIKVEV ASSQAETTTL DQDHLETTVA ILLSBTRPQN HTPTAARMKE PASSSPSTIb 1500

MSLGgTTTTK PALPSPRISQ ASRDSKE37VP UTYVQVPETE ATPVNNEGTQ HKSGPNSLST 1560 

PSSDRDAFNL STKLELBXOV FGSR8LPRGP DSQRQDGRVH ASHQLTRVPA KFIIjPTATVR 1620 

ItPEHSTQSAS RYFVTSQSPR BWTNKPBITT YPSGALPEtHC QFTTFRItSST TIPLPLBKSK 1680 

PSIPSKPTDR RTDQPNGYSK VFQQIKIPEA RNPVGKPPSP RIPHYSNGRL PFFTNKTLSP 1740 

PQLGVTHRPQ IPTSPAPVMR ERKVIPGSYN RIHSHSTFHL DFGPPAPPLL HTPQTTGSPS 1800 

TNLQNIPMVS STQSSISPIT SSVQSSGSPH QSSSKFFAGG PPASKFWSLG EKPQILTKSP 1860 

QTVSVTABTD TVPPCBATOK PKPPVTWTKV STGAUmWt RIQRFEVLKN GTLVIRKVQV 1920 

ffXROQYMCTA SNLHGLDRMV VUE«SVTVCX2P QILASHYQDV TVYUaXTZAM BCLAK6TPAP 1980 

QISWIFPDRR VHQTVSPVES RITLHENRTL SIKEASPSDR GVYKCVASNA AGADSLRIRL 2040 

HVAALPPVIH QEKLBinSIiP PGLSIHIHCT AKAAPLPSVR WVLGDGTQIR PSQELHOJIiF 2100 

VFPNGTLYIR HLAPXDSGRY ECVAANLVGS ARRTVQIJ3VQ RAAANARITG TSPRRTDVRY 2160 

GGTLKIiDCSA SGDPWPRILH RLPSKRKIDA LFSFDSRIKV FANGTLWKS VTDKDAGDYL 2220 

CVARNKVGDD YVVIiKVDWM KPAKIEHKEE NDBKVFYGGD UCVDCVATGL PNPBISHSLP 2280 

DGSLVKSFMQ SDDS66RTKR YWFNNGTLY FNEVGMREEG DYTCFAQiQV GKDEMRVRVK 2340 

WTAPATIRN KTYLAVQVPY GDWTVACEA KGEPMPKVTW LSPTNKVIPT SSEKYQIYOD 2400 

GTLLIQKAQR SDSOTYTCLV RHSAGEDRKT VHIHVNVQPP KINGNFNPZT TVREZAAGGS 2460 

RKLZDOCASG ZPTPRVZiHAF PE6VVLPAPY Y6NRZTVBGN G5U3ZRSXiRK SDSVQLVCNA 2520 

RNGG6EARLZ VQLTVLBPME KPZFHDPZSB KXTAMAGHTZ SIMCSAAGTP TPSLVWVLFN 2580 

GTDLQSGQQL QRFYHKAD6M LHXSGXiSSVD AGAYRCVARN AAGHTERLVS LKVGLKPEAN 2640 

KQYHHLVSIZ NGETLKLPCT PPGAGQGRFS WTI»PNGMHLE GPQTIiGRVSL LDNGTLTVRE 2700 

ASVFDRGTYV CRMETBYGPS VTSIPVZVIA YPPRZTSBPT PVXYTRPGOT VKIjNC34AKGI 2760 

PKADZTHELP DKSBLKA6VQ ARIiYISIRFLB PQGSLTZQHA TQRSAGFYKC KMOnLGSDS 2820 
KTTYZHVF 

Seq ID NO: 418 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 1..5001

1 11 21 31 41 51 

I I I I I I

ATGCCA66CA CAAAACTAAC COGAACAGQC GCCCCA6CAG ACTACAGAGT GATATTGAAG 60 

AGCTCTCAAG AGSACQAATT GGATGTACCT OAGGACATCA GCGTCOG G GT T ATGTCAT CT 120 

GAGTCTGTGC TTGTGTCCTO GGTGGATCCT GTTCTGGAAA AACAGAAGAA AGTTGTTGCA 180 

TCAAGACA6T ACACOGTGCG CTATGGA6A6 AAGGGGGAAT TGGCXIAGGTG GGATTATAAG 240 

CAQATOSCTA ACAGGCGTGT GCTGATTGAG AACCTGATTC CAQACACTGT GTATGAATTT 300 

GCAGTCOGTA TTTCACAGGG TGAAAGAGAT 6GCAAATGGA GTAOGTCAGT CTTCCAAAGA 360 

ACAOCAGAAT CTGCCOCTAC CACAGCTOCT GAAAACTTGA AGGTCTGGCC AGTCAATG6C 420 

AAACCTACA6 TTGTOGCTGC ATCTTGGGAT OOQCTACCAG AGACTGAG6G GAAAGTGAAA 480 

GTCTGTCTGC TOQACACAGG ACTOTTTTCA 6TTTCCTCCT TCCAACCATC TGCCAAATCA 540 

TTTCAGAATA CATTCTTTCA TACGCCCOGG CTCTCAAACC ATTTGGAGCA AAGTCCCTCA 600 

CCTATCCrGG AGACACTACT TCT6CCCTG6 TGGAT6GTCT GCA00CTQG6 6AACGCTATC 660 

TTTTCAAAAT OOGGGCCACA AACAGGAQAO GOCTGGOACC TCACTCCAAA GCCTTCATTG 720 

TG6CTATGCC AACAAGAAT6 CA6CTG7ACC CAGAAGGATT TCA6TTGTCT AGC TTAC CTG 780 

ATOGATATCC AAAGCAAACA AGTTAATAAA GATCCACAAC TGGAAGGGAG TGTTTTTGGA 840 

CCATOTTTTC TTTTCTACTT CCTCACATTT ATGCTGGATA TTGGOGGCTT TTCCTTCATT 900 

ATOTGCTATG AAGACOCANN T6TTTCTTCT TTGACAGGCA ATTCTTTAAA ATCTGTTGCA 960 

GCCAGIAAGG 0Q6ATSTTCA GCAGAACACG GAGGACAATG GOAAACOOQA AAAACCIGAO 1020 

CCPTCCTCAC CTTCTCCCA6 AQCTCCaOCT TOCTCCCAAC ACCCCTCTGT GCCT6CITCT 1080 

CCCCAAGGGA GAAATGCCAA GGACCTTCTT CTTGACTTGA AGAACAAAAT ATTGGCTAAT 1140 

GGTOGGGOGC CCOGAAAACC CCAGCTTCGC GCCAAGAAGG CAGAGGAGCT GGATCTTCAG 1200 

TOGACAGAAA TCACTG6G6A GGAGGAGCTG GGTTCCOGGG AG6ACT0GCC CATGTCACCC 1260 

TCAGACAOCC AA6ACX»GAA AOGGAOCCTS AGGCGGGCAA QTAGACAOGG CCACTGGGTQ 1320 

GTT6CTCC06 6CAGGACTGC AGT6A6G60C CGGATGCCAO 08CT0CCC06 AAG6GAAG6C 1380 

GTAGATAAGC CTGGCTTTTC CXTOSCCACaS CAGCCCX3GCC CAGGGGCGCC CCCCTCGGCT 1440 

TOGGCCTCTC CTGCCCACCA OOOGTCCACC CA6GGCACCT CTCATOGTCC TTCCCTGCCT 1500 

6CCAGCTTGA AT6ACAA0SA CTTGGTGGAC TCAGAOGAAG ATGAGGGCXK: T6TGG6CTCC 1560 

CtCCMXCCA AGG60GCCTT OGOCCAGOCC GGGCCAGCXX: TGTCOCCCAG COSCCAGTCC 1620 

C0GTCCA6CG TTCTCOGCGA CAGAAGCTCT GTGCACCCOO 6CGCAAAGCC AGCCT060CG 1680 

GCGCGGAGGA CCCCCCATTC AGGGGCOGCA GAGGAAGATT CCA6TGCCTC AGCCCCACCC 1740 

TCAAGACTTT CTCCACCCCA TGGGGGATCA TCTOGGCTGC TGCCCACCCA GCCACACCTG 1800 

AGCTCTCCAC TTTCCAAGGG 0GGGAA8GAT 06TGAGGA0G CCCCAGCCAC CAACTCCAAT 1860 

GOGCCATCAC OGTCCACCAT GTCCTGCTOC GTCICTTCTC ATCTCTCOTC CAGQA06CAG 1920 

GTCTCTGA6G GAGCX3GAGGC TTCIGATGGT GAAAGCCAOQ GTGA0G60GA TAGGGAAGAC 1980 

GGCGGAAGGC AGGC3GGAGGC CAOGGCCCAG ACGCTGCGGG CCCGGCCTGC CTCTGGACAC 2040 

TTCCATTTGC TCAGACACAA ACCXriTTGCT GCCAAOGGGA GGTCTCCAAG CAGGTTCAGC 2100 

ATTGGGCG6G GACCTG66CT GCAGOCCTCC AGCTCCX3CAC AGTCGACTGT GOCCTCCOQA 2160 
GCGCACCCCA GGGirCCXTTC TGACTCTGftT TCCCAOOCm AGCTTAGCTC AG6TAT0CAT 2220

GGA6ACGAG0 AGGATGAGAA GCOGCTTOCT 6CCAO0CTTG TCAATQACOl GGT6CCTTCC 2280 

TCCrCCAGGC AGCCCATCrC C066GGCTGG 6AGGACTTAA GGAGAAGCCC 6CAGAGAGGG 2340 

GCCAGCCT6C ATOGGAAGGA ACCCftTCOCA GAGAACCCCA AATCCACA6G GGCAGATACA 2400 

CATCCTCAGG GCAAGTACTC CTCCCTG6CC TCCAA6GCTC AGGATQTTCA ACAGA6CACA 2460 

GAG6C6GACA 0GQA6G6TCIV TTCTCOCAAA GCACAGOCAG GGTOCACAGA C060CAGG08 2520 

TCeCCTGCTC GTCCTCCC6C AGCAGGGTCA CAGCA6CA7C 0C3UjTGTTCC CAGAAGGATO 2580 

ACACCOGGCC GGGCCCCAGA ACAGCAGOX CCTCCTCCXS5 TC3GCCA0GTC CCAGCACCAC 2640 

COGGGACXXX AGAGCRGAGA 0GCXXK3T0GG TCACCTTCCC AGCCCAGGCT CTCACTGACC 2700 

CAGGCOGGGC GGCCCOGCCC CyVOGTOGCAG GCCCGCTCCC ACTCCTCCTC GGACOCTTAC 2760 

AGGGOSASCT CCAGAGGQAT QCTCC0C3100 GCCCTOCAGA ACCAGQAOQA G6ATGCCCA0 2820 

G6CAGCTA06 A0GA06ACA8 CAGAGAAOTC 6AGGCCCAGG AIGTSOGOGC CCC060GCAC 2880 

GCCGOGOGOO CCAAGGAGGC AGCTGCGTCC CTTCCCAAGC ACCAGCAGGT GGAGTCTCCC 2940 

ACAGGOGCAG GGGCAGGTGG CGACCACAGG TCCCAG060G GACATGCGGC CTCCCOCGCC 3000 

A06CCCAGCC GACCC6G0G6 CCCCCAGTCC OGCX3CO0GGG TGCCCAGCAG GGCAGGGCOG 3060 

GOGAAGTGGG AGCCTCCTTC CAAG OGOCCC CTGTOC TC CA AOTO OCAGCA gTO QGTCTCA 3120 

GCCXSAGGACS A66AG6A66A GGAOS08GGG TTTTTTAAAG 600GGAAAGA AGAOCTTCTO 3160 

TCTTCCTCTG TGCCAAftGTG GCCCTCTTCC TCCACTCCCA GGGGOOGCAA AGA0GOCX5AT 3240 

GGGAGCCT06 CCAAGGAAGA QAGGGAGCCT GCCATCGCGC TTGCCCCTCG CGGAGGGAGC 3300 

CTGGCTCCTG 7GAAGCGACC TCTCCCCCCA CCTCCAGGCA GCTCCOCCAG GGCXTTCCCAC 3360 

GTCCCTTCCC GACCGCCGCC T06CAG06CT GCCACOGTGA GCCGGGTOGC GOOC AOOCAC 3420 

CGCTGGCCGC GGTACACCAC G0606CGCCV CXTTGGCQICT TCTCCACCAC CCOGATGCTG 3480 

TCCTTGOGCC AGAGGATGAT GCATGCCAGA TTCCGTAACC CTCTCTCCOS ACAGCCTGCC 3540 

AGACCCTCTT ACAGACAAGG TTATAATGGC AGACCAAATG TAGAAGGGAA AGTCCTTCCT 3600 

GGTA6TAATG GAAAACCX3AA TGGACAGAGA ATTATCAATG GCCCTCAAGG AACAAAGTGG 3660 

GTTGTGGACC TTGATCGTGG GTTAGTATTG AATGCAGAAG GAAGGTACCT CCAAOATTCA 3720 

CATG6AAATC CTCTTCGGAT TAAACTAGGA GGAGATGGTC GAACCATT6T AGATCTGGAA 3780

GGGACCCCCG TGGTGAGTCC TGACX5GCCTC CCACTCTTTO GGCAGGGGCG ACATGGCACA 3840

CCTCTGGCCA ATGOCCAAGA TAAGCXAATT TTGAGTCTTG GAGGAAAGCC GCTGGTCGGC 3900 

TTGGAGQTCA TCAAAAAAAC CACCX3VTCCC CCTACXACTA (XATGCAGCC CACCACTACT 3960 

ACGACGCCCC TGCCTACCAC TACAACCCCG AGGCCCACCA CTGCCACCAC CATGCAGCCC 4020 

ACCACTACTA OGACGCCCCT GCCTACCACT ACACOGAGGC CCACCACTGC CACCACCCX3C 4080 

CGCAOGACCA (XAGGOGTCC AACAACCACA GTCC3GAACC31 CTACG06GAC AACCACCACC 4140 

ACCACCCCCA AACCCACCAC TCCCATCCCC ACCTGTCCCC CTGGGACCTT GGAACGGCAC 4200 

GAOGATGATO GCAACCTGAT AATGAGCTCC AATGGGATCC CAGAGTGCTA CGCTGAAGAA 4260 

GATGAGTTCT CAGGCTTGGA GACTGACACT GCAGTACCTA CGGAAGAGGC CTAOSTTATA 4320 

TATGAT6AA6 ATTATGAATT TGAGAOGTCA AGGCCACCAA CCACCACTGA GCCTTOGACC 4380 

ACTGCTACCA CAC0GAGG6T GA7CCCAGAG GAAGG080CA TCAGTTGCTT TCCTGAA6AA 4440 

GAATTTGATC TGGCTGGAAa GAAA08ATTT GTTGCTOCTT A0GT6AC6TA CCTAAATAAA 4500 

GACCCATCAG CCCOGTOCTC TCTGACTGAT GCACTGGATC ACTTCCAAGT GGACAGCCTG 4560 

6ATGAAATCA TCCCCAATGA CCTGAAGAAG AGTGATCTGC CTCCCCAGCA TGCTCCCCGC 4620 

AACATCAC08 TGGTG6C0GT GGAAGGTTGC CACTCATTTG TCATTGTGGA TTGOGACAAA 4680

G0CAC0CCA8 GAGATTT6GT CACA6GTTAT TTGGTTTACA GTGCATCCTA TOAAOATTTC 4740 

ATCAGGAACA AGTTTTCCAC TCAASCTTCA TCAOTAACTC ACTTOCCCAT T6AQAACCTA 4800 

AAGCCCAACA 0GAG6TATTA TTTTAAAGTO CAAGCACAAA ATCCTCATGG CTA06GACCT 4860 

ATCAGCCCrr CGGTCTCArr TCTCACaSAA TCAGA2AATC CTCTGC7TGT TGTGAGGCCC 4920 

CCAGGCGGTG ACCTATCTGG ATCCCATTOG CTTTCAAACA TGATCCCAGC TACAOSGACT 4980 

GCCATGGA08 0CAATATGT6 AAGCQCAGST 06TAT0SAAA OTTOOIGGQA GTTQTTCTTT 5040 

GTAATTCACT GAGGTATAAA ATCTACCTCA GTGACAACCT GAAAOATACA TTCTAGAGCA 5100 

TTGGA6ACAG CTGG6GAAGA GGTGAAGACC ATTGOCAATT TGT6GATTCA CACCTTGATG 5160 

GAAGAACAGG GCCTCAGTCC TATGTAGAAG CCCTCCCTAC TATTCAAGGC TACTATCGCC 5220 

AGTAT06TCA GGA6CCTGTC AGGTTTGGGA ACAT066CTT OGGAAOCCCC TACTACTATQ 5280 

TG06CI00TA 0GAGTGTG6G GTCTCCATCC CTG6AAAGT6 GTAATCACAQ GAOCSITCATG 5340 

CTGCAAGCTT 6CCCTGCCCA GCCCCAGCAA CTAA6T0GCA CTAG0G6CT0 TQAGCAAAGA 5400 

CAGCCA6CAT GCTCAGCCCC GCTGCCCTA6 GTGCCAGGAA GGTCACAGAT GGACACTGGC 5460 

CATTCTGGTC ATCTCAGTCT GGAACTCAOT CCCACTTCTT GGCCTGGACA ATGAACAGGA 5520 

TTCA6TTTTG CTGTTAACTT TOCTTCTCTA CnT TTT' n ' G TTTGTTTGTA ATAGCACATC 5580 

CXAGAOACaiT CAOAAACCAG CAACTGATTC AGTGTGATTT OOCAGACTTT TTA06CATGA 5640 

AATTCGOACA CTTCAGTATT TCCAGGAATA GCATATGCAC GCTGTTCTTG CTTCATOGAA 5700 

'TGCTACATGC ' m Cl VT rri' TCTCATTTTG GATTTCTCCA AAACTAACTO AATTTAAGCT 5760 

TCAG6TCCCT TTGTATGCAQ TAGAAAGGAA TTATTAAAAA CACCAOCAAA GAAAATAAAT 5820

ATATCCTACT TGAAATTTAC TCTATGGACT TACCCACTGC TAGAATAAAT GTATCAAATC 5880 

TTATTTGTAA ATTCTCAATT TT6ATATATA TATGTATATA TGCATATACA TATCCACACT 5940 

TGTCTGCAAG AATATTGATT AAAATT6CTA AATTTGTACT TGTTCACCAA AAAAAAAAAA 6000 
AAAAAAA 

Seq ID NO: 419 Protein sequence
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I

MPGTKLTRTG APADYRVIUC TSQEDELDVp DDISVRVMSS QSVLVSWVDP VLEKQKKWA 60 

SRQYTVRYRB KOBKAiCHDyK QIANRRVLIS NLIPDTVyEP AVRZSQGBRO GKHSTSVFQR 120 

TPESAPTTAP EEJUnmPVHG RPTWAASHD ALPETE6KVK VCLLDTGLFS V59FQFSAKS 180 

FQNTFFHTPR LSNHLEQSPS PILETLLLPH mVCSl/XM FSKSGPQTGB ASmLTPXPSti 240 

SLCQQECSCT QKDFSCLAYL IDIQTKQVNR DPQLBGSVFG PCFLPYFLTF MLDIGGFSFI 300 

KCyEDFVSSL TGtTSLXSVAA SXADVQGNTB XJNGKPEKPEP 8SPSPSAPAS SQBPSVPASP 360 

QGRNAXDUiL DIiXSKILAMG QAPRKFQLRA KKASELDXiQS TEZT6EEELS SRS>SPNSPS 420 

DTQDQKRTLR PPSBHGB5W APGRTAVRAR MPALPRRBBV DXPGPSXiATQ PRPGAPPSAS 480 

ASPAHHASTQ GTSHRPSILPA SLNDMDLVDS DEDERAVGSL HPKGAFAQPR PALSPSRQSP 540 

SSVLRDR8SV HPGAKPASPA RRTPH5GAAE EDSSASAPPS RLSPPBGGSS RLLPTQFBI^S 600 

SPLSRGGXDG EDAPATNSNA PSRSTMSSSV SSBl*SSRTQV SEGAEASDGE SH6DGDREDG 660 

aSQABATAQT LRARPASGHF HLLRHRPFAA NGRSPSRPSZ G&6PRLQPSS SFQSTVPSBA 720 

HPRVPSBSDS BPKLSSGIEG DEEDERPLPA TWMDHVP8S SRQPISBGWB DLSRSPQRQA 780 

SLERKEPIPE KPKSTGADTH PQGKY8SLAS KAQDVQQSTD ADTEGHSPKA QPQSTDREAS 840 

PARPPAARSQ QRPSVPRRMT PGRAPEQQPP PPVATSQBEP GPQ5BDAGRS PSQPRLSLTQ 900 

AGRPRPTSQ6 RSBSSSDPYT ASSRCaCiPTA LONQDS>AQO SYDDDSTBVB AQDVRAPABA 960 
ARAKBAAASIi PKHQQVBSPT GA6AGS>BRS QRCTAASPAR PSRPGGFQSR ARVFSRAAFG 1020

KSEPPSKRPL SSKSQQSVSA EDEESEDAGP PRG UKKB LLS SSVPXWPSSS TPRGGKDiADG 1080

SLAKEEREPA XAXAPRGGSL APVKRPLPPP PGSSPRASHV PSRPPPRSAA TVSFVAGTHP 1140 

WPRYTTRAPP GHFSTTPMLS LRQRMMHARF RNPI*SRQPAR PSYRQGYNC3R PNVEGKVLPG 1200 

SNGKPNGQRI INGPQGTKWV VDZ^DRGLVLM AEGRYIiQDSH GNPLRIKZ/SG DGRTI VPLBG 1260 

TFWSPDGZiP LPGQGRSiCTP LANAQDKPZL SLG6KPLV6L EVIIQC TTHPP TTTKQPTTTT 1320 

TPLPTTTTPR PTTATTMQPT TTTTPLPTTT PRPTTATTRR TTTRRPTTTV jCl'l'VU Vl'm 1380 

TPKPTTPIPT CPPGTLBRHD DDGNLIMSSW GIPECYAEED EPSGLETDTA VPTBEAYVIY 1440 

DEDYBFBTSR PPTTTBPSTT ATTPRVIPBB GAISSPPEEE FDIAGRKRFV APYVTYUJia) 1500 

PSAPCSLTDA LDHPQVDSLD EIIPHDLKKS DLPPQHAPRN ITWAVBGCH SFVIVDMDKA 1560 

TPGDLVTGYL VYSASYCTFI RNKPSTQASS VTBLPZBiniK PMTRYVFKVQ AOKPHOYGPI 1620 
SPSVSFVTES DNPI1X4WRPP GGEXjSGSHSI* SNNIPATRTA HDCaM 

Seq ID NO: 420 DNA sequence
Nucleic Acid Accession #: NM_022743
Coding sequence: 128..1237

1 11 21 31 41 51 

I I I I I I

GTGGATTTTA GAGATACCTC CCCTCCTTCT GCTCACCTGC CTTGCAGTAA TTAAACTCTT 60 

TCTCTGCTGC AACACCCCTA CrGTTCTCOO TGTATTG6CT TTTCTGGGCA 6CA0GAA0GA 120 

AAAGCTGATG GGATGCTCTC AOTGCOGOGT 06CCAAATAC TGTAGTGCTA AGT6TCAGAA 180 

AAAAGCTTGG OCAOACCAGA AG0666AATG CAAATGCCTT AAAAGCT6CA AACCCAGATA 240 

TOCTCCAGAC TOOSTTOGAC rrCTTGGCAG A6TTGTCTTC AAACTTATGO AT0GA6CACC 300 

TTCAGAATCA GAGAAGCTTT ACTCATTTTA TGATCTGGAG TCAAATATTA ACAAACTGAC 360

TGAAGATAAG AAAGAGGGCC TCAGGCAACT CX5TAATGACA TTTCAACATT TCATGAGAGA 420 

AGAAATACA6 GATGCCTCTC AGCTOCCACC TGCCTTTGAC CilTrf GAAG CCTTT GCAAA 480 

AGTGATCTGC AACTCTTTCA OCATCTGTAA' TGOGGJUSATG CWSGAAGTTG GT6TTG6CCT 540 

ATATCCCAGT ATCTCTTTGC TGAATCACAG CTGTGAOOOC AACTGTTOGA TTGTGTTCAA 600 

TGGOCCCCAC CTCTTACTGC GAGCAGTOOG AGACATOGAG GTGGGA6A00 AGCTTCACCAT 660 

CTGCTACCTG GATATGCTGA TGACCAGTGA GGAGCGCCGG AAGCAGCTGA GGGACCAGTA 720 

CTGCTTT6AA TGTGACTGTT TCCGTTGCCA AACCCAG6AC AAGGATGCTG ATATGCTAAC 780 

T66TGAT6AG CAAGTATQGA AG6AAGTTCA AGAATCCCT6 AAAAAAATTG AAGAACTGAA 840 

G6CACACTGG AA6TGGGAGC AG G T T CTGGC CATGT6CCAG 6CGATCATAA GCAGCAATTC 900 

TCAAOGGCTT CXXX5ATATCA ACATCTACCA GCTGAAGGTG CTOGACTQCG COVTGGATGC 960 

CTGCATCAAC CTCGGCCTGT TGGAGGAAGC CTTGTTCTAT GGTACTC30GA CCATGGAGCC 1020 

ATACAGGATT TTTTTCCCAG GAAGCCATCC OCTCaGAGGG GTTCAACnXSA TGAAAGTTGG 1080 

GAAACTGCAG CTACATCAAG GCATGTTTCC CCAAGCAATG AAGAATCTGA GACTG6CTTT 1140

TGATATTATG AGAQTGACAC AT66CAGA6A ACACAGCCT6 ATTGAAGATT TQATTCTACT 1200 

TTTAGAAGAA TGCGAOGCCA ACATCAGAGC ATCCTAAGGG AACGCAOTCA GAGGGAAATA 1260 

0GGCX3TGTGT CTTTGTTGAA TGCXJTTATTG AGGTCACACA CTCTAT6CTT TGTTAGCTGT 1320 

GTGAACCTCT CTTATTGGAA ATTCTGTTCC GTGTTTGTGT AGGTAAATAA AGGCAGACAT 1380 

aGTTTGCAAA CCACAAGAAT CATTAGTTGT AGAQAAOCAC GATTATAATA AATTCAAAAC 1440 
ATTTGGTTGA GGATGCCAAA AAAAAAAAAA AAAAAA A 

Seq ID NO: 421 Protein sequence
Protein Accession #: NP_073580

1 11 21 31 41 51 

I I I I I I

MRCSQCRVAK YCSAKOQKKA WPDHKRECKC LKSCKPRYPP DSVRLLGRW FKIMXSAPSE 60 

SBKLYSFYDL ESUIKKLTED KKBGliRQLVM TFQHFMRESI QDASQLPPAF DLFEAPAKVI 120 

aiSFTIQXAE MQBVGVGLYP SI8I1UIH8CD PNCSZVFNGP HLLLRAVRDI EVGEBLTICY 180 

LDMU4TSEER RKQIADQYCF BCDCFROQTQ DKDADMLT6D EQVHKEVQE8 LKKIEBLKAH 240 

WKWEQVLAMC QAIISSNSER LFDINIYQIiR VLDCANDACI HLGLLSEALF YGTRTMEPVR 300

IPFPGSHFVR GVCfVNXCVQKL QLHOGMPPQA MKHLRIAFDI KRVTBGBEBS UGOLIIiLLB 360 
ECDANIRAS 

Seq ID NO: 422 DNA sequence

Nucleic Acid Accession #: NM_003014.2

Coding sequence: 238..648

1 11 21 31 41 51 

I I I I I I

GG06GGTTCG CGCCCCGPAQ GCTGAGAGCT GOOGCTGCTC GTGCCCTGT6 TGCCAGAOGG 60 

0GGAGCTCCX3 C6SCCX3GACC COGCGGCCOC GCTTT6CTGC OGACTGGAGT TTGGGGGAAG 120 

AAACTCTCCT GCGCCCCAGA AGATTTCTTC CTCG60GAAG G6ACAG0GAA AQAT6A8GGT 180 

GGCAGQAAGA GAAG606CTT TCTG T CTGCC GGGGTOGCAB CGGGA6AGG6 CAGTGCCAT6 240 

TTCCrCTCCA TCCTAGTGGC GCTGTGCCTO TGGCTGCACC T6G06CTGG0 0GT60G0G6C 300 

GCGCCCTGOG AGGCGGTGCG CATCCCTATG TGCCGGCACA TGCXTCTGGAA CATCAC3GCGG 360 

ATGCCCAACC ACCTGCACCA CAOCACGCAG GAGAACGCCA TCCTGGCCAT CGA6CAGTAC 420 

GAGGAGCTGG TGGAOGTGAA CTGCAGCGCC GTGCTGCGCr TCTTCTTCT6 TGCCAOXSTAC 480 

GCQCCCATTT GCAOCCTGGA GTTCCTGCAC GACCCTATCA AQCC6T0«A QTOGGTGTGC 540 

CAAOGCGCGC G06ACGACTG 06A6CCCCTC ATQAAGAT6T ACAACCACAG CTGGCCCSGAA 600 

AGCCTGGCCT GOSAOGAGCT GCCTGTCTAT GACCGTGGOO TGTGCATTTC GCCTGAAGCC 660 

ATOGTCACGG ACCTCCCGGA GGATGTTAAO TGGATAGACA TCACACCAGA CATGAT6GTA 720 

CAGGAAAGGC CTCITGATGT TGACTGTAAA 0GCCTAA6CC CCGATG6GTQ CAAGTGTAAA 780 

AAGGTGAA8C CAACTTTGGC AAOGTATCTC AGCAAAAACT ACAGCTATGT TATTCATQCC 840 

AAAATAAAAG CTGTGCAGAG GAGTGGCTGC AA7GAGGTCA CAAC6GTGGT GGATGTAAAA 900 

GAGATCTTCA AGTCCTCATC ACCO^TCCCT CGAACTCAAG TCCOGCTCAT TACAAATTCT 960 

TCTTGCCAOT GTCCACACAT CCTGCCCCAT CAAQATGTTC TCATCATGTG TTAaSAGTGG 1020 

OGTTCAAGGA TGATGCTTCT TGAAAATTGC TTAGTrGAAA AATGGAGAGA TCAGCTTAGT 1080 

AAAAGATCCA TACAGTGGGA AGAGAGGCT6 CAGGAACAGC GGAGAACAQT TCAOGACAAO 1140 

AAGAAAACAG CCG0GCX3CAC CAGTOGTAGT AATCOCCCCA AACCAAAGGG AAAGCCTCCT 1200 

GCTCCCAAAC CA6CCAGTCC CAA6AAGAAC ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260 

.AACOOGAAAA GAGT6TGAQC TAACTAGTTT CCAAAGCG6A GACTTCCX3AC TTCCTTACAO 1320 

'gaTSAQGCIG G6CATTGCCT GGCZACAGCCT ATGTAAGGCC ATGTGCCCCT TGCOCIAACA 1380 
ACXCACTGOV GtGCTCTTOV TAGACAOITC TTGCAGOITT TTTCTTAAG6 CTAT6CTTCA
U ' mri ' Cm ' GTAAGCCATC ACAAGCCATA OI' GGTflGG lT TGCCCTTTG6 TACAGAAGGT 
GAGTTAAAGC TGGTGGAAAA GGCTTATTGC ATTGCATTCA GAtSTAACCTG TGTCCATACT 
CTAGAAGAGT AGGGAAAATA ATGCTTGTTA CAATTOGACX: TAATATGTGC ATTGTAAAAT 
AAAT6CCATA TTTCAAACAA AACAOGTAAT TTTTTTACAG TATGTTTTAT TAOCTTTTGA 
TATC T S ITG T T6CAATGTTA GTGAIOTTTT AAAAT6TGAT GAAAATATAA TGTTTTTAAfi 
AAG6AACAGT AGTGGAATGA ATGTTAAAAG ATCTTTATGT GTTTAtGGTC T6CAGAAG6A 
TTTTTGTCAT GAAAGGGGAT TTTTTGAAAA ATTAGAQAAG TAOCATATGG AAAATTATAA 

■ i -GrGr rr rrr tacc3uvtgac TTCAGTTTCt gtttttagct agaaacttaa aaacaaaaat 

AATAATAAA8 AAAAATAAAT AAAAAOGAGA G6CA6ACAAT GTCT6GATTC CTG M TT' n - G 
GTTACXrrGAT TTCCATGATC ATGATGCTTC TTGTCAACAC GCTCTTAAGC AGCACCASAA 
ACAGTGAGTT TGTCTC5TACC ATTAGGAGTT AGGTACTAAT TAGTTGGCTA ATGCTCAAGT 
ATTTTATACC CACAAGAGAG GTATGTCACT CATCTTACTT CCCAGGACAT CX310CCTGAG 
AATAATTTGA CAAGCTTAAA AATGGCCTTC ATGTGAGTGC CAAATTTTGT TTTTCPTCAT 
TTAAATATTT TCTTTGCCTA AATACATGTG AGAGGAGTTA AATATAAATG TACAGAGAGG 
AAAGTTGAGT TCCACCTCTG AAATGAGAAT TACTTCACAG TTGGGA7ACT TTAATCAGAA 
AAAAAGAACT TATTTOCAGC ATTTTATCAA CAAATTTCAT AATTGTGGAC AATTGGAGGC 
ATTTATTTTA AAAAACAATT TTATTGGCCT TTTGCTAACA CAGTAAOCAT GTATTTTATA 
AGGCATTCAA TAAATGCACA ACGCCXZAAAG GAAATAAAAT CCXATCTAAT CCTACTCTCC 
ACTACACAGA GGTAATCACT ATTAGTATTT TGGCATATTA TTCTCCAGGT GTTTGCTTAT 
GCACTTATAA AATGATTTGA ACAAATAAAA CTAOGAACCT GTATACAT6T GTTTCATAAC 
CTGCCTCCTT TGCTTGGCCC TTTATTGA6A TAAGTTTTCC TGTCAAGAAA GCAGAAACCA 
TCTCATTTCT AACAGCTOTG TTATATTCCA TAGTATGCAT TACTCAACAA ACTGTTGTGC 
TATTQGATAC TTAGGTSSTT TCTTCACTGA OVATACTGAA TAAACATCTC ACOGGAATTC 

Seq ID NO: 423 Protein sequence 
Protein Accession #: NP_003005.1 

1 11 21 31 41 51 

| | | | | | 

MFLSILVAIX: LWLHLALGVR GAPCEAVRIP MCRHMPWNIT RNPKHLHHST QEHAILAIEQ 60 

YEELVDVNCS AVLRFFFCAM YAPICTLBFL HDPIKPCRSV OQRARQDCEP LNKMYMHSHP 120 

BSLACDEI*PV YDRGVCZSPB AZVTDLPBDV KHZDITFDMM VQERPLDVDC KRLSPDRCKC 180 

KKVXPTLATY LSmSYVIB AKIKAVQRSG QIBVTTWDV KEIFXSSSPI PRTQVPLZTH 240 

SSCQCFHZZiP HQOVLINOrB HS5RMMLLBH CLVBKHRDQZf SXRSIQHEBR LQEQSRTVOD 300 
RRKTAGRTSR SNPPKPKGKP PAPKFASPKK KZKTRSAQKR TNPKRV 



Seq ID NO: 424 DNA sequence 
Nucleic Acid Accession #: BC010423 
Coding sequence: 248..1780 

1 11 21 31 41 51 

| | | | | | 

CACAGOrraG 6AAGCAGCTC TGGGGGAGCT GGGAGCTCCC GATCA06GCT TCTTGGGGGT 60 

AGCTAOGGCT GGGTGTGTAG AAC3GGGGCCG GGGCTGGGGC TGGGTCCCCT AGTGGAGACC 120 

CAAGTGOGAG AGGCAAGAAC TCTGCAGCTT CCTGCCTTCT GGGTCAGTTC CTTATTCAAG 180 

TCTGCA6CG6 GCTCCCAGGG AGATCTCGGT 6GAACTTCAG AAACX3CTGGG CAGTCTGCCT 240 

TTCAACOITG CCOCTQTCCC TQGGAGCG8A GATGTGQGGG CCT6AG6CCT GGCT6CTGCT 300 

GCTGCTACTG CTGGCATCAT TTACAGGOOG GTGCOCOGCG GGTGAGC7Y3G AGACCTCAGA 360 

CGTGGTAACT GTGGTGCTGG GCCAGGACGC AAAACTGCCC TGCTTCTACC GAGGGGACTC 420 

CGG06AGCAA GTGGGGCAAG TGGCATGGGC TCGGGTGGAC GOGGGCGAAG GCGCXXIAGGA 480 

ACTAGOGCTA CTGCACTCCA AATACX3GGCT TCATGTGA6C C0G6CTTACG AGGGCOGOGT 540 

GGA6CAGC08 CG60CCCCAC 6CAACCC0CT GQAGGGCTCA GTGCTCCTGC GCAAOGCAGT 600 

6CAGGGGQAT QAGQOOQAQT ACGAGT6C03 GGTCAGCACC TT00C06C08 GCAGCTTCCA 660 

GGGGOOGCTO 06GCTCOGAG TGCTGGTGCC [illegible] TCACTGAATC CTQGTCCAGC 720 

ACTAGAAGAG GGCCAGGGCC TGACCCTGGC AGCCTCCTGC ACAGCTGAGG GCAGCCCAGC 780 

0C0CAG06TG AOCTGGGACA CGGAGGTCAA AGGCaCAAOa TCCAGCOGTT CCTTCAAGCA 840 

CT0C06CTCT GCTOOOQTCA CCTCAQAGTT CCACTTGGTQ CCTAGCC3GCA GCATQAATGG 900 

GCAGCCACTG ACTTGTQTQG TGTCCCATCC TQGCCTOCTC CAOGACCAAA GGATCACCCA 960 

CATCCTCCAC GTOTCCTTCC TTGCTGAGGC CTCtGTGAGG GGCCTTGAAG ACCAAAATCT 1020 

GTGGCACATT GGCAGAGAAG GAGCTATGCT CAAGTGCCTG AGTGAAGGGC AGCCCCCTCC 1080 

CTCATACAAC 7G6ACA0GGC TGGATGGGOC TCTOCCCAGT GGGGTACGAG TGGATGGGGA 1140 

CACTTT6G6C TTTC3CCCCAC TGAGCACIQA GCACAOCOGC ATCTAOGTCT GCCATGTCAG 1200 

C3UVTGAGTTC TOCTCAAGGO ATTCTCAGOT CACTGTGGAT 0TTCTT6ACC CCCA6GAAGA 1260 

CTCTGGGAAG CAOGTGGACC TAGTGTCAGC CTCGGTOGTX; GTGG1X?GGT0 TGATCGCCGC 1320 

ACrCTTGTTC TGOCTTCTGG TGGTGGTGGT GGTGCTCATG TCCOGATACC ATOGOGGCAA 1380 

G6CCCAGCAG ATGACCCAGA AATATGA6GA 6GAGCTGA0C CTQACCAG66 AGAACTCCAT 1440 

COGGAGGCTG CATTCCCATC ACAC66ACCC CAG6AGCX3U2 COGGAGQAGA 9T0CAG0GCT 1500 

GAGAGCCGAG GGCCACCCTG ATAGTCTCAA GGACAACAGT AGCTGCTCTG TGAT6AGTGA 1560 

A6A6CXX3GA0 GGC06CAGTT ACTCCACGCT GACCAOGGTG AGGGAGATA6 AAACACAGAC 1620 

TGAACTGCTG TCTOCAGGCT CTGGGCG6GC CGAGGAGGAG GAAGATCAGG ATGAAGGCAT 1680 

CAAACAGGCC ATGAACCATT TTGTTCAGGA GAATGGGACC CTACGGGCCA AGCCCACGG6 1740 

CAATOGCATC TACATCAATG 6G0G6GGACA CCTGGTCT6A CGCAQGCCTG GCTCCCTTCC 1800 

CTAGGCCTGG CTCCTTCTGT TGACATG6GA GATTTTAOCT CATCTT0GG6 GCCTCCTTAA 1860 

ACACCCCCAT TTCTTGCCGA AGATQCTOCX: CATCCCACTG ACT6CTTGAC CTTTACCTCC 1920 

AACCCTTCTO TTCATCGGGA OGGCTCCAOC AATTGAGTCr CTCCCACCAT GCATGCAGGT 1980 

CACTGTGTGT GTOCATGTGT GCCTGTGTGA GTGTT6ACTG ACTGTGTGTG T6T0GAGGG6 2040 

TGACTGTCCG TGGAGGGGTG ACTGTGTOOG TGGTGTGTAT TATGCTGTCA TA7K!AGAGTC 2100 

AAGTGAACTG T6GT6TATGT 6CCA06GGAT TTGAGTGGTT G0GT6GGCAA GACTGTCAGG 2160 

[illegible] TGTGTCATGT GGCTGTGTGT GACCTCTGCC TGAAAAAGCA GGTATTTTCT 2220 

CAGACCCCAO AGCAGTATTA ATGAT6CAGA GGTTGGAG6A GAGAGGTGGA GACTGTGGCT 2280 

CAGACCCA6G TGTG06GGCA TAGCTGGAGC TGGAATCTGC CTCOGGTGTQ AG6GAACCTG 2340 

TCTCCTAOCA CTTOGGAGCC ATGGGGGCAA GTGTGAAGCA GCCAGTCCCT GGGTCAGOCA 2400 

GAGGCTTGAA CTCTTACA6A AGOOCTCTGC CCTCTGGTGQ CCTCTQGGCC T6CIGCAT6T 2460 

ACATATTTTC TQIAAATATA CATGOQOOQG 6A0CTTCTTG CAGGAATACT GCTCCGAATC 2520 

ACTTTTAATT TTTTTCTTTT TTrTTTCTTO CCCTTTCCAT TAGTTGTATT TTTTATTTAT 2580 

TTTTATTTTT ATTTTTTTTT AGA6TTTGAG TCCAGOCTGG ACXSATATAGC GAGACCCTGT 2640 
CICTAAAAAA ACCAAAACCC AAAAAAAAAA AAAAAAAAAA 

Seq ID NO: 425 Protein sequence 
Protein Accession #: AAB10423 

1 11 21 31 41 51 

| | | | | | 

MPLSZiGAQW GPEAWLLLLL LLASFTGRCP AGELETSDW TWLGQDAKL PC7YR<a)SGB 60 

QVGOVAMARV DAGEGAQELA LLHSKYGIjHV SPAYBGRVBQ PPPPENPUX3 SVLUINAVQA 120 

DBSBYSCRVS TFPAGSFQAR tRLRVLVPPZi PSLNPGPALE BGQGLTLAAS CTASG5PAPS 180 

VTNDTEVKST TSSRSFKHSR SAAVTSEFHL VPSRSMHGQP XiTCWSBFGL LQDQSITBIL 240 

BVSFLAEASV RGltBDQHLHB Z^IBQAMLKC LSBGQPPPSY NHTRUSGPLP SGVRVXXHrrb 300 

GFPPLTTEHS GIYVCBVSNE FSSRDSQVTV DVXi)PQa>SG KQVDLVSASV VWGVIAALL 360 

PCLLWWVL MSRVHRRKAQ QMTQKyEEEL TLTRQISKR LHSHHTDPRS QPEESVGLRA 420 

EGHPDSLKDN S5CSVMSEEP BGRSYSTLTT VREZETQTEL LSPGSGSAEZ EEDQDEGZKQ 480 
AMNBFVQEKG TLRAKPT6N6 lYZKGRGBLV 

Seq ID NO: 426 DNA sequence 

Nucleic Acid Accession #: NM_003474.2 

Coding sequence: 307..3036 

1 11 21 31 41 51 

| | | | | | 

CACTAAOGCT CTTCCTAGTC CCCGGGCCAA CTOQGACAGT TTGCTCATTT ATTGCAACGG 60 

TCAAGGC7GG CTTGTGCCAG AACX3GOG06C GCGG6A0GCA CGCACM3VCA 0GGGGG6AAA 120 

CrmTTAAA AATGAAAGGC TAOAAGAGCT CA60GG0GGC GCGGGCCXTTG CXSCGAGGGCT 180 

CC6GAGCTGA CTCX5CCGAGG CAGGAAATCC CTCOGGTOSC GACGCCCGGC CCCGCTCGGC 240 

GCXXGOGTGG GATGGT6CAG 0GCTC36CC6C CGGGCCXX5AC AGCTGCT6CA CTGAAOGCCG 300 

GCGA06ATGG CAGCGCGCCC GCTGCC06TG TCCCCX30CCX: GCGCCCTCCT GCTCGCCCTG 360 

GCCGGTGCTC TGCTOGCGCC CTGOGAGGCC CGA6G60TGA GCTTATGGAA OSAAGGAAGA 420 

GCTGAT6AA0 TTGTCAGTGC CTCTGTTCGG AGTGGGGACC TCTGGATCCC AGTGAAGAGC 480 

TTCQACTCCA AGAATCATCC AGAAGTGCTG AATATTCXSAC TACAAOGGGA AAGCAAAGAA 540 

CT6ATCATAA ATCTGGAAAG AAAT6AAGGT CTCATTGCXA GCAGTTTCAC GGAAACCCAC 600 

TATCTGCAA6 A0B6TACTGA TGTCTCCCTC GCTOQAAATT ACACGC5TAAT TCTGGGTCAC 660 

TGTTACTACC ATGGACATGT AOQGGGATAT TCTGATTCAG CAGTCftGTCT CAGCACGTGT 720 

TCTGGTCTCA GGGGACTTAT TGTGTTTGAA AATGAAAGCT ATGTCTTAGA ACCAATGAAA 780 

AGTOCAACCA ACAQATACAA ACTCTTCCCA GOGAAGAAGC TGAAAAGCX3T CCGGGGATCA 840 

TGIX38ATCAC ATCACAACAC AOCAAACCTC GCT6CAAAGA ATGTGTTTOC ACCACCCTCT 900 

CAGACAT068 CAA6AAG6CA TAAAAGA8AG ACCCTCAAGQ CAACTAAGTA TGTGGAGCTG 960 

6TGAT0GT06 CAGACAAC06 AGAGTTTCAG AGGCAAOGAA AAGATCTG6A AAAAGTTAA6 1020 

CAGC3GATTAA TAGAGATTGC TAATCACGTT GACAAGTTTT ACAGACCACT GAACATTC3GG 1080 

ATaSTGTTGG TAGGCGTGGA AGTGTGGAAT GACATGGACA AATGCTCTOT AA6TCAGGAC 1140 

OCATTCAOCA 60CTCCATGA ATTTCTGGAC TGGAG6AAGA TQAAfiCTTCT ACCTOGCAAA 1200 

TOCCATGACA AT6CGCAGCT TGTCAGTGG6 GTTTATTTCC AAGGGACCAC CAT06QCATG 1260 

GCCXXaUVTCA TGAGCATGTG CACGGCAGAC CAGTCTGQGG GAATTGTCAT GQACCATTCA 1320 

GACAATCCCC TTGGTGCAGC OGTGACCCTG GCACATGAGC TGGGCCACAA TTTCGGGATG 1380 

AATCATGACA CACTGGACA6 GGGCTGTAGC TGTCAAATGG CGGTTGAGAA AGGAGQCTGC 1440 

ATCATGAACO CTTCCAC060 GTACCCATTT CCCATGGTGT TCAGCAGTT6 CAGCAGGAAO 1500 

6ACTTGGAGA CCAGCCTG6A GAAAG6AATG 6G0GTG7G0C TGTTTAACCT 6CC06AAGTC 1560 

AGGGAGTCTT TGGGGGGCCA GAAGTGT6GG AACAGATTTO TGGAAGAAGG AGAGGAGTGT 1620 

6ACTGTC5GGG AGCCAGAGGA ATGTATGAAT CGCTOCTGCA ATGCCACCAC CTGTACCCTG 1680 

AA6C0GGA06 CTGTGTGCQC ACATGGGCTG TGCTGTGAAG ACTGCCAGCT GAAGCCTGCA 1740 

GSAACAGGOT 6CAG6GACTC CA6CAACTCC TGTGAGCTGC CA6AGTTCTG CACAG6GGGC 1800 

AGCCCTCACT GCCCAGCCAA CGTGTACCTG CAOGATOGGC ACTCAT6TCA GGATGT6GAC 1860 

GGCTACTGCT ACAATGGCAT CTGCCAGACT CAOGAGCAGC AOTGTGTCAC ACTCTGGG6A 1920 

CCAG6TGCTA AACCTGCCCC TGGGATCTGC TTTGAGAQAG TCAATTCTOC AGGTGATCCT 1980 

TATG6CAACT GTG6CAAAGT CTOSAAGAGT TCCTTTOCCA AATQ0QA6AT GAGAGATGCT 2040 

AAAT6T08AA AAATCCAGTO TCAAGGAOGT 6CCAGC0G6C CAGTC AT T G G TACCAATGCC 2100 

GTTOXXATAO AAACAAACAT CCC0CT6CAG CAAOGAGGCC GG A TTCTGTG CC3GGG6GACC 2160 

CACGTOTACT TG6GC6ATGA CATGCCGGAC CCAGGGCTTG TGCTTGCAGG CACAAAGTGT 2220 

GCAGATGGAA AAATCTGCCT GAATOGTCAA TGTCAAAATA TTAGTGTCTT TQGGGTTCAC 2280 

GAGTXnX^CAA TGCAGT6CCA OGGCAGAGGQ GTGTGCAACA ACAGGAAGAA CTGCCACTGC 2340 

GAGGCCCACT 6GGCA0CTCC CTTCTGTGAC AA G TTT GGC T TT66AG6AAG CACAGACAGC 2400 

G6CCCCATCC 6GCAA6CAGA TAACCAAGGT TTAACCATA6 GAATTCTGGT GACCATOCTG 2460 

' i ' G ' fCTl ' Cl ' i ' G CTGCCGGATT TGTGOTTTAT CTCAAAAGGA AGACCTTOAT ACGACTGCTO 2520 

TTTACAAATA AGAAGACCAC CAT7X5AAAAA CTAAGGTGTG TGCGCCCTTC CCG6CCACCC 2580 

OGTGGCTTCC AACCCTQTCA GGCTCACCTC GGOCACCTTO GAAAAGGCCT GATGAG6AAG 2640 

COGCCAGATT CCTACCXACC GAAGGACAAI CCCAOGAGAT TGCTGCA6TG TCAGAATGTT 2700 

GACATCAGCA GACCCCTCAA C66CCT6AAT GTCCCTCA6C CCCAGTCAAC TCAGGGAGT6 2760 

CTTCCTOCCC TCCACOQGGC. CCCAC6T6CA CCTAGCGTCC CTGCCAGACC CCTGCCAGCC 2820 

AA6CCT6CAC TTA6GCAGGC CCAGGGGACC TGTAAGCCAA ACCCCCCTCA GAAOCCTCTQ 2880 

CCTGCAGATC CTCP G GCCAG AACAACTCG6 CTGACTCATG CCTTGGCCAO GACCCCAOGA 2940 

CAATGG6AGA CTG6GCTC0G CCTGOCACCC CTCAOACCTO CTCCACAATA TCCACACCAA 3000 

GTGCCCAGAT CCACXXS^CAC CGCCTATATT AAGTGAGAAG C06ACA0CTT TTTTCAACAG 3060 

TGAAGACAQA AGTTTGCACT ATCTTTCAGC T0CAGTT06A GTTTTTTGTA CCAACTTTTA 3120 

GGATTTTTTT TAATGTTTAA AACATCATTA CTATAAGAAC TTTGAGCTAC TGCCX3TCAGT 3180 

GCTGTGCTGT GCTAT6GTGC TCTGTCTACT T6CACAGGTA CTTQTAAATT ATTAATTTAT 3240 

GCAGAATGTT 6ATTACAGTG CMalGOGCTG TAGTAG8CAT TTTTAOCATC ACTGAGTTTT 3300 

CXATGGCAGG AAGGCITGTT QTGCTTTTAS TATTTTAGTG AACTTGAAAT ATCC1TXTT6 3360 

ATGGOATTCT GGACAGGATG ' IvmX^TTT CTGATCAAGG CCTTATTGGA AAGCAGTCCC 3420 

OCAACTACCC OCAGCTGTGC TTATGGTACC AGATGCAGCT CAAGAGATCC CAASTAGAAT 3480 

CfCAGTTGAT TTTCTGGATr CCCCATCTCA GGCCAGAGCC AAQ6G6CTTC AGGTOCAGGC 3540 

TGTGTTTGGC TTTCAGG6AG GCCCTGTGCC OCTTOACAAC TGQC3U36CAG GCTCOCAGGG 3600 

ACACCTGGGA GAAATCTGGC TTCT660CAQ GAAQCTTT6G TGAGAACCTG GGTTGCAGAC 3660 

AGGAATCTTA AGGTGTAOOC ACACCAGGAT AGAGACTGGA ACRCTAGACA AGCCA6AACT 3720 

TQACCCTGAG CTGACCAGCC GTGAGCATGT TTGGAAGGGG TCTGTAGTGT CACTCAAGGC 3780 

G6T0CTTGAT AGAAATGCCA AGCACTTCTT TTTCT06CTG TOCTTTCZAG AGCACTGCXA 3840 
CCAGTAGGTT ATTTAGCTTG GGAAAGGTOG ' l -GTTTCTST A AGAAACCTAC TGCCCAGGCA 3900 

CTGCAAACCG OCACCTCCCT ATACTGCTTO GAGCTGAGCA AATCACCACA AACTG TAATA 3960 

CAATCATCCT GTATTCAGAC AGATGAGGAC TTTCCATGGG ACCACAACTA TTTTCAGATG 4020 

TGAACCATTA ACCAGATCTA GTCAATCAAG TCTGTTTACT GCAAGGTTCA ACTTATTAAC 4080 

AATTAGGCAO ACTCTTTATG CTTGCAAAAA CTACAACCAA TG6AAT6TGA TGTTCATGGG 4140 

TATAGTTCAT GTCTGCTATC ATTATTOGTA GATATTOGAC AAAGAACCTT CTCTATGGGG 4200 

CATCCTCTTT TTCCAACTTG GCTGCAGGAA TCTTTAAAAG ATGCTTTTAA CAGAGTCTGA 4260 

ACCTATTTCT TAAACACTTG CAACCTACCT GTTGAGCATC ACAGAATGTG ATAAGGAAAT 4320 

CAACTTGCTT ATCAACTTCC TAAATATTAT GAGATGTGGC TTGGGCAGCA TCCCCTTGAA 4380 

CTCTTCACTC TTCAAATGCC TGACTAGGGA GCCATGTTTC ACAAGGTCTT TAAAG TGACT 4440 

AATGGCATGA GAAATACAAA AATACTCAGA TAAGGTAAAA TGCCATGATG Ud'CTUTCfT 4500 

CTGGACTGGT TTTCACATTA GAAGACAATT GACAACAGTT ACATAATTCA CTCTGAGTGT 4560 

TT7ATGAGAA AGOCTTCTTT TGGOGTCAAC AGTTITCCTA TGCTTTGAAA CAGAAAAATA 4620 

TGTACCAAfiA ATCTTGGTTT GCCTTCCAGA AAACAAAACT GCATTTCACT TTCCCGGTX5T 4680 

TOCCCACTGT ATCTAGGCAA CATAGTATTC ATGACTATGG ATAAACTAAA CAOGTG ACAC 4740 

AAACACACAC AAAAGGCy^C CCAGCTCTAA TACATTCCAA CTCGTATAGC ATGCATCTGT 4800 

TTATTCTATA GTTATTAAGT TCTTTAAAAT OTAAAGCCAT GCTGGAAAAT AATACTGCTG 4860 

A6ATACATAC A6AATTACTG TAACTTGATTA GACTTGGTAA TTGTACTAAA GCCAAACATA 4920 

TATATACTAT TAAAAAGGTT TACAQAATTT TATQGTQCAT TACGTGGGCA TTCTCITTTT 4980 

AOATGCCCAA ATOCTTAGAT CTGGCATGTT AGCCXrXTCCT CCAATTATAA GAGCiATATGA 5040 
ACCAAAAAAA AAAAAAAAAA AA 

Seq ID NO: 427 Protein sequence 
Protein Accession #: NP_003465 

1 11 21 31 41 51 

| | | | | | 

MAARPLPV8P ARALLLALAG ALIiAPCBARO VSLHNEGRAD EWSASVSS6 DLWIPVRSFD 60 

SKNHPEVUIZ RLQRESKBLZ ZNLERNB6LI ASSFTETByL QDGTDVSLAR NYTVIIXSlCy 120 

YHGHVRGYSD SAVSLSTCSG LRGLIVFEME SYVLEPMKSA TNRYKLFPAK KbKSVRQSCG 180 

SHHNTPNLAA KNVFPPPSQT WARRHKRETL KATKYVELVI VADNREPQRQ GKDLEKVKQR 240 

LIBXAHHVDK FYRPLNZRZV LVGVEVWNDM DKCSVSQDPP TSIiHEFLDWR KMKLLPRKSH 300 

miAQXiVSGVY FQGTTI(91AP ZMSKCTADQS GGIVMDKSDH PLGAAVTLAH EU3BS?<BWB 360 

DTLDRGCSCQ MAVBKGGCIM HASTGYPPPM VPSSCSRKDL ETSLEKtSIGV CLFHLPBVRE 420 

SPGGQKOGNR FVEEGEEOX: GEPEECMNRC CKATTCTLKP DAVCAHGLCC EDOQLKPAGT 480 

ACRDSSNSCD LPEFCTGASP HCPANVYIiHD GHSOODVDGY CYHGICQTHE QQCVTLWGPG 540 

AKPAPGZCFB RVNSAGDPYG NCGKVSKSSF AKCBMRDAKC GKIQCQGGAS RPVIGTNAVS 600 

lETNIPLQOa GRILCS6THV YLGDDMFDPG LVLAGTKCAD GKICLNRQOQ IfZSVFGVHEC 660 

AMQCHGR6VC NNRXNCHCBA BWAPPFCDKF GPGGSTDSGP ZRQADNQGLT IGZLVTILCL 720 

LAAGFWYLK RKTLIRLLFT NKKTTIEKIiR CVRPSRPPRG FQPOQAHLGH LGKGUOIKPP 780 

DSYPPKDNPR RLLQCQNVDI SRPLNQIiJVP QPQSTQRVLP PLHRAPRAPS VPARPI»PAKP 840 

ALRQAOGTOC PNPPQKPLPA DPLARTTRLT HAIiARTPGQW ETGLRIAPLR PAFQYFHQVP 900 
RSTSTAYIK 

Seq ID NO: 428 DNA sequence 

Nucleic Acid Accession #: NM_003714 

Coding sequence: 135..1043 

1 11 21 31 41 51 

| | | | | | 

GAGGAGGAGG GAAAAGGOQA GCAAAAAGGA AGAGT06GAG GAGGAGGGGA AGCGGOGAAG 60 

GAGGAAGAGG AGGAGGAGGA AGAOGGGAGC ACAAAGGATC CAG6TCTGGC GACGGGAGGT 120 

TAATACCAAO AACCATGTGT GCOGAGCGOC TOGOCCAOTT CATOAOOCTG GCTTTOOTOT 180 

TGGCCACCTT TGACOSOGOG C06GG6AC0G AOGCCACCAA CXJCACCCJGAG GGTCCCCAAO 240 

ACAGGAGCTC CCAGCAGAAA GGCOGCCTGT CCCTGCAOAA TACA60GGAG ATCCAGCACT 300 

GTTTGGTCAA CGCTGGOGAT GTGQGGTGTO GCGTGTTTGA ATGTTT06AG AACAACTCTT 360 

6TGAGATT0Q GGGCTTACAT GOGATTTGCA TGACTTTTCT 6CACAACGCT G GAAAA TTTG 420 

AT6CCCAGGG CAAGTCATTC ATCAAAGACG CCTT6AAAT0 TAA60G0CAC QCTCT QOGO C 480 

ACAGGTTOOG CTGCATAAGC 06GAAGTGCC CX9GGCATCA6 G6AAAT0GTG TCCAGTTGC 540 

AOOGGGAATG CTACCTCAAG CACGACCTGT GCX50GGCTGC CCAOGAGAAC ACCOG GGTG A 600 

TAQTGGAGAT GATCCATTTC AAGGACTTGC TOCTQCAOGA ACCCTAOGTG GACCTOGTGA 660 

ACTTGCTGCT GACCTGTGGQ QAGGAGGTGA AGGAGGCCAT CACCCACAGC GTGCAflOTTC 720 

AGTGTGAGCA GAACTGGGGA AGCCTOTGCT CCATCTTGAG CTTCTGCAOC TOGOCOITOC 780 

AGAAOCCTCC CAOQOCOCCC CCCGAGCGCC AGCCCCAGGT GGACAOAACC AAQCTCTCCA 840 

GGGCCCACCA CGGGQAAGCA GGACATCACC TCCCAGAGCC CAGCASTAGG GAGACTGGCC 900 

QAG6TGCCAA GGGTGAGCGA GGTAGCAAGA GCGACCCAAA OQOCCATGCC C6AGGCAGAG 960 

TOGGGGGCCT TGGGGCTCAG GGACCTTCOG GAAGCAGOGA GTGGGAAGAC GAACACTCIG 1020 

AGTATTCTQA TATC0G0AG6 TGAAATGAAA GGCCTGGCCA CGAAATCTTT CCTOCAOQCC 1080 

GTCCATTTTC TTATCTATGG ACATTCCAAA ACATTTACCA TTAGAGAGGG GGGATGTCAC 1140 

A06CAGGATT CTGTGGGGAC TGTGGACTTC ATCGAC3GTGT GTGTTC60G6 AAOGGACAGG 1200 

IGAGATGGAS ACCCCTGGGG COGTGGGGTC TCAGGGGTGC CTGGTGAATT CTGC ACTTA C 1260 

AOGTACTCAA GGGAGCGCGC C0GCX3TTATC CTCGTACCTT TGTCTTCTTT CCATCTGTGG 1320 

AGTCAGTGGG TGTOSGCCGC TCTGTTGTGO OGGAGGTGAA CCAGGGAGGG GCAQGGCAAO 1380 

GCAG6GCCCC CAGAGCTGGG CCACACAGTG GGTGCTGGGC CrOGCCCCGH AGCTTCTGGT 1440 

GCAGCAGOCT CT GO iqCTGr CTCOGOGGAA GTCAG6GC3GG CTGGATTCCA GQACAGGAGT 1500 

QAAT6TAAAA ATAAATATC6 CTTA6AATGC AGGAGAAG66 T6GAGAGGAG GCAGGGGCOG 1560 

AGGGGGT6CT TGGT G CCAAA CTGAAATTCA GTTTCTTGTG TGGGGCCTTG OGGTTCAGAQ 1620 

CTCTTGGOGA GGGTGGAGGQ AOGAGTGTCA TTTCTATGTG TAATTTCTGA GCCATT6TAC 1680 

TGTCTGGGCT GGGGGGGACA CTGTOCAAGG GAGTGOCCCC TATGAGTTTA TATT TTAA CC 1740 

ACTGCrrCAA ATCTOGATTT CACTTTTTTT ATTTATOCAG TTATATCTAC ATATCTGTCA 1800 

TCIAAATAAA TOGCTTTCAA ACAAAOCAAC TGCGTOITTA AAACCAGCTC AAAGGGGGTT 1860 

TAAAAAAAAA AAAACCAGCC CATCCTTTGA GGCTGATTTT TCTTTTTTTT AAGTTCTATT 1920 

TTAAAAGCTA TCAAACAGOG ACATAGCCAT ACATCTGACT GCCTGACATG GACTCCTGCC 1980 

CACTTGGGGG AAACCTTATA CCC3V3AGGAA AATACACACC TCGGGAGTAC ATTTGACAAA 2040 

TTTGCCTTAG GATTTOGTTA TCTCACCTTG A0CCTCA6CC AAGATTGGTA AAGCTGCGTC 2100 

CTGGOaVTTC CAOGABACCC AGCTGGAAAC CT UttCi T CTi; CATGrGAGGQ ^TGGGAAAG 2160 

GAAA6AAGA0 AATOAAGACT ACTTAGTAAT TOC3CATCA66 AAAT6CIGAC CTTTTACATA 2220 
AAKTGAA6GA GACTGCIGAA A21TCTCTAAG QGACnCGhTT TTCCAGATCC TAATTGGAAA 2280 

TTTAGCAATA AG6AGAGGAG TCCAA6G0GA CAAATAAA06 CAGAiaGhGA GACAGAGAGA 2340 
G6GAGAG6AA GAAAAGAGAG AGAGAAAAGA G0CTGST6CC 

Seq ID NO: 429 Protein sequence 
Protein Accession #: NP_003705 

1 11 21 31 41 51 

| | | | | | 

MCAERIiGQFM TUUiVLATFD PARGTDATNF PEGFQSRSSQ QKGRLSIX2NT ABIQHCLVNA 60 

GDVGGSVFEC FENNSCBZRG LHOZCMTFUI NAGKFDAQGK SFZXDAIiRCR AHALRHRFGC 120 

XSRKCPAXRB MVSQIiQRBCy LRHDLCAAAQ Q]TRVZVEMZ RPlCDUiLBEP YVDLVNLZiLT 180 

CGEEVKBAZT B5VQVQCBQM HGSLCSZIiSP CTSAZQKPPT APPERQPQVD RTKLSRABKO 240 

EAGBBLPSP8 SREKSGAKG ESGSXSHFNA KARSRVGOLG AQGPSGSSEM BDBQSEY6DZ 300 

RR 

Seq ID NO: 430 DNA sequence 
Nucleic Acid Accession #: NM_005940 
Coding sequence: 23..1489 

1 11 21 31 41 51 

| | | | | | 

AAGCCCAGCA GCCCOGGGOC GGATGGCTCC GGCOGCCTGG CTCCGCAGOG CGGCC3GCGCG 60 

OGCCCTCCTG CCCCCGATGC TGCTGCTGCT GCTCCAGCOG CCGCX3GCTGC TGGCXXX3GGC 120 

TCTGCCGCCX3 6ACGTCCACC ACCTCCATGC OGAGAGQAGG GGGCCACAGC CCTGOCATGC 180 

AGCXXTTGCCC AGTAGCCOGG CACCTGCCCC TGCCACX3CAG GAACCCCCCC GGCCTGCCAC 240 

CAGCCrCAGG CCTCCCOGCT GTGGCGTGCC CGACCCATCT GATGGGCTGA GTGCCOGCAA 300 

COGACAGAAG AGGTTCGTOC TTTCTGGCGG GOGCTGGGAG AAGAC3GGACC tCACCTACAG 360 

GATCCTTCGG TTCCCATGGC AGTTGGTGC3V GGAGCAGGTG OGGCAGACGA TGGCAGAGGC 420 

CCTAAAGGTA TGGAGCGATG TGACGCCACT CACXTTTTACT GAG6TGCAC0 AGGGCOGTGC 480 

TGACATCATG ATC6ACTTCG CCAGGTACTG GCATGGGGAC GACCTGCCGT TTGATGGGCC 540 

TGGGGGCATC CTGGCCCATG CCTTCTTCCC CAAGACTCAC CGAGAAGGGG ATGTCCACTT 600 

GGACTATGAT GA6ACCT6GA CTATCQGGGA TGACCAGGGC ACAGACCTGC TGCAGGTGGC 660 

AOOCCATOAA TTTQGCCAOG TGCTGGGGCT GCAGCACACA ACAGGAGCCA AGGCCCTGAT 720 

GTOOGOCTTC TACACCTTTC GCTACCCACT GAGTCTCAGC CCAGATGACT GCAGGGGOGT 780 

TCAACACCTA TATGGCXIAGC CCTGGCCCAC TGTCACXrrCC AGGACCCCAG CCCTGGGCCC 840 

COVOGCTGGG ATAGACACX31 ATGAGATTGC ACOGCTGGAO CCAGA06CCC CGCCAGATGC 900 

CIGXGAQGCC T0CTTTQAG6 CGGTCTCCAC CATCOGAGGC GAGCTCTTTT TCTTCAAAGC 960 

GGGCTTTQTQ TGGCBCCTCC GTQQQG6CCA OCTGCAGCCC GGCTAOCCAG CATTGGCCTC 1020 

TCGCXACPGQ CAGGGACTGC CCAGCCCTGT 6GACGCT6CC TTCGAGGATG CCCAGGGCCA 1080 

CATTTGGTTC TTCCAAGGTG CTCAQTACTG GGTGTACGAC GGTGAAAAGC CAGTCCTGGG 1140 

CCCC6CACCC CTCACCGAGC TGGGCCTGGT GAGGTTCC06 GTCXIATCCTG CCTTGGTCTG 1200 

GGGTC006AG AAGAACAAQA TCTACTTCTT OOQAOGCAGG QACTACTGGC GTTTCCACCC 1260 

CAGCA0006G OGTGTAGACA GT0C06TGCC C08CA0G6CC ACTQACTGGA 6AG0GGTGCC 1320 

CTCTGAGATC GAC3GCTGCCT TCCAGGATGC TGATGGCTAT GCCTACTTCC TGCGCGGCOG 1380 

CCTCTACTGG AAGTTTGACC CTGTQAAGGT GAAGGCTCTG GAAGGCTTCX CCOSTCTOGT 1440 

GG6TCCTGAC TTCTTTGGCT GTGCCGAGCC TGCCAACACT TTOCTCTGAC CATG6CTTGG 1500 

ATGCOCTCAG G66T6CTGAC CCCTGCCAGG OCAOQAATAT CAOQCTAGAG ACGCATG6CC 1560 

ATCTTTOTOG CTGTOQGCAC CAG6CATGG0 AC7X3AGOCCA TgTCTCCTGC AGGGGGAT6G 1620 

GGTGGGGTAC AACCACCATG ACAACTGCCG GGAGGGCCAC GCAGGTCGTG GTCACCTGCC 1680 

AGCGACT6TC TCAGACTGGG CAGGQAGGCT TTGGCATGAC TTAAGAGGAA GGGCAGTCTT 1740 

GGGACXXGCT ATGCAGGTCC TGGCAAACCT GGCTGCCCTG TCTCATCCCT GTCCCTCAGG 1800 

GTAGCACCAT G6CAG6ACT6 666GAACTGG A6TGTCCTTG CTGT A TCCCT GTTOTOAGGT 1860 

TCCTTCCAG6 GGCT66CACT GAAGCAAGGG TGCTGGGGCC CCATG6CCTT CA6CCCTGGC 1920 

TGAGCAACTG GGCTGTAGOG CAGGGCCACT TCCTGAOGTC AQGTCTTGGT AGGTGCCTGC 1980 

ATCTGTCTGC CTTCTQOCTO ACAATCCTGG AAATCT6TTC TCCAGAATCC AGGCCAAAAA 2040 

GTTCACAGTC AAATGGGGAG GGGTATTCTT CAT6CAGGAG ACCCCAGGCC CTGGAGGCTG 2100 

CAACATAOCT CAATCCT8TC CCAQGCOGGA TOC TOCTG AA G0CCTTTT08 CA QCACTG CT 2160 

ATCCTCCAAA GCCATTGTAA ATGTGTGTAC AGTGTGTATA AAOCTTCTTC TTCTTTTTTT 2220 
TTTTTAAACT GAGGATTGTC ATTAAACACA OTTOTTTTCT 

Seq ID NO: 431 Protein sequence 
Protein Accession #: NP_005931 

1 11 21 31 41 51 

| | | | | | 

MAPAAHLRSA AARALLPPML LLLLQPPPZtL ARALPFDVHH LHAERRGPQP WHAALPSSPA 60 

PAPATQEAP& PASSLRPPRC GVPDPSDGLS ARNRQKSFVL SGGRHEKTDL TYRIIiRFPWQ 120 

LVQEQfVRjQTM ASALKVWSDV TPLTFTEVEB 6RADIKIDFA RYWHODDLPP D6PG6ZLABA 180 

PFPKTHRBGD VHPDYDETWT IGDDQGTDLL QVAAHEFGHV LGIiQHTTAAK ALMSAFYTPR 240 

yPLSLSPDDC RGVQHLYGQP WPTVTSRTPA LGPOAOIDTN BIAPLEPDAP PDACEASFDA 300 

VSTZRGBZiFF FRAGFVHRLR GGQLQPG7FA LASRHHQGLP 8PVDAAFEDA QGHIWPPQGA 360 

QYNVYDGSKP VLGPAPLTSb GLVRPPVBAA tiVHGPEXNKZ yPFRGRDYHR PEPSTRRVDS 420 

PVPRSATDHR 6VPSBIQAAF QDADGVAYPL RGRLYHXFDP VKVKALEGFP RLVGPDFFGC 480 
AEPANTFL 

Seq ID NO: 432 DNA sequence 

Nucleic Acid Accession #: NM_024022 

Coding sequence: 202..1563 

1 11 21 31 41 51 

| | | | | | 

ACQGQOCACC GGAOGGCTC96 GGTACTTTC6 TTCTTAATCA GGTCAT6CCC GTGTGAGCCA 60 

GGAAAGGGCT GTGTTTATGG GAAGCCAGTA ACACTGTOGC CTACTATCTC TTCCGTGGTG 120 

CCATCTACAT TTTTGGGACT CGGGAATTAT 6AGGTAGAG6 TGGAGGCGGA GOOGGATGTC 180 

AGAQGTCCTG AAATAGTCAC CATGGGGGAA AATGATCCGC CrGCTGTTGA AGCCCCCTTC 240 

TCATTCOGAT CGCTTTTT6G CCTTGATQAT TTGAAAATAA GTOCTGTTGC ACCAGATGCA 300 
GATCCTGTTO CTGCACAGAT CCTGTCACTG CTGCOITTGA AGTTrrTTCC AATCATCGTC 360 

ATTGGGATCA TTGCATTGAT ATTAGCACTG GCCATTGGTC TGGGCATCCA CTTOCSACTGC 420 

TCAGGGAAGT ACAGAT6TOG CTCATCCTTT AAGT6IAT00 AGCT6ATAGC TOGATGTGAC 480 

G6AGTCTCGG ATTGOVMGA OQQGGAGGAC GAGTAOOQCT qi G TCCOGCT GOQTGGTCAG 540 

AATOCOGTGC TCCAGGTGTT CACAGCTGCT TC3GTQGAAGA CCATGTGCTC OGATGACTGG 600 

AAGGGTCACT ACX3CAAATGT TGCCTGTGCC CAACTGGGTT TCCCAAGCTA TGTGAGTTCA 660 

GATAACCTCA GAGTGAGCTC GCTGGAGGGG CAGTTCCGGG AGGAGTTTCT GTCCATCGAT 720 

CACCTCTTGC CAGATGACAA GGTGACTGCA TTACACCACT CAGTATATXTT GACX3GAGGGA 780 

TGTGCCTCTC GCCAOGTGGT^ TACCTTGCAG TGCACAGCCT GTGGTCATAG AAGGGGCTAC 840 

AGCTCACGCA TOGTGGGTGg' AAACATGTCC TTGCTCTOGC AGTGGCCCPG GCAGGCCAGC 900 

CTTCAGTTCC AGGGCTACCA CXTGTGCGGa GGCTCTGTCA TCACGCCCCT GTGGATCATC 960 

ACTGCT6CAC ACTGTGTTTA TGACTTGTAC CTCCCCAAGT CATGGAOCAT CCAGGTGGGT 1020 

CTAGTTTCCC TGTTGGACAA TCCAGCCCCA TCCCACTTGG TGGAGAAGAT TGTCTACCAC 1080 

AGCAAGTACA AGCCAAAGAG GCTGGGCAAT GACATOGCCX: TTATGAAGCT GGCCGGGCCA 1140 

CTCAOGTTCA ATGAAATGAT CXAGOCTGTG TGCCTGCCCA ACTCTGAAGA GAACTTCCCC 1200 

GAIGGAAAAG TGTGCT6QAC GTCAGGATG6 GGGGGCACAG AGGATG6AGG TGAOGCCTCC 1260 

CCT6TCCTGA ACCACGOGGC CGTUCCmC ATTTCCAACA AGATCTGCAA CCACA6GGAC 1320 

GTGTAC66T0 OCATCATCTC CCCCTCCATG CTCTGOGCGG GCTACXTGAC GGGTGGCGTG 1380 

GACAGCTGCC AGGGGGACAG CGGGGGGCCC CTGG7X3TGTC AASAGAGGAG GCTGTGGAAG 1440 

TTA6TGGGAG 06ACCAGCTT T6GCAT06GC TGOGCAGAGG TGAACAAGCC TGGGGTGTAC 1500 

ACCCG7GTCA CCTOCTTCCT GGACTGGATC CAOSAGCAGA 7GGAGA6AGA OCTAAAAAOC 1560 

TGAAGAG6AA GGGGACAAGT AGOCACCZGA QTTOCTOAgG TGATGAAGAC AGOCCGATOC 1620 

TCOCCTGGAC TCCQtjTGTAQ GAACCTGCAC AOGAGCAGAC ACCCTTGGAG CTCTGAGTTC 1680 

OGGCACCAGT AOCAGGCCOG AAAGAGGCAC CCTTCCATCT GATTCCAGCA CAACCTTCAA 1740 

GCTGCTTTTT GTTTTTTGTT TTTTTGAGGT GGAGTCTOGC TCTGTTGOCC AGGCTG6ACT 1800 

GCAGTG60GA AATCCCTGCT CACTGCAGOC TCOGCTTOOC TGGTTCAAGC GATTCTCTTG 1860 

CCTCAGCTTC CCX»GTAGCT GGGACCACAG GT6CC0GCCA CCACACOOUl CTAATTTTrG 1920 

TATTTTTAGT AGAGACAGGG TTTCAOCATQ TTGGOCAGGC TGCTCTCAAA CCCCTGACCT 1980 

CAAATGATGT GCCTGCTTCA GCCTCCCACA GTGCTGGGAT TACAGGCATG GGCCACCAOG 2040 

CXTTAGCCTCA COCTCCTTTC TGATCTTGAC TAAGAACAAA A6AAGCAGCA ACTTOCAAGG 2100 

GOGGCCTTTC GCACT6GTCC ATCTGGTTTT CTCTCCAGGG GTCTTGCAAA ATTGCTGAOQ 2160 

AGATAAGCAG TTATGTGACC TCA08TGCAA AGCCACCAAC AGCCACTCAG AAAA6ACGCA 2220 

CCAGCCCAGA AGTGCAGAAC TOCAGTCACT GCAOSTTTTC ATCTCTAGGG ACCAGAACCA 2280 

AACCCACCCT TrCTACTTCX: AAGACTTATT TTCACATGTG GGGAGGTTAA TCTAGGAATC 2340 

ACTOSTTTAA GGCCTATTTT CATGATTTCT TTGTAGCATT TGGTGCTTGA CGTATTATTG 2400 

TCCTTTGATT CCAAATAATA T G TTTH3C1 TC CCTCAAAAAA AAAAAAAAAA AAAAAAAAAA 2460 
AAAAA 

Seq ID NO: 433 Protein sequence 
Protein Accession #: NP_076927 

1 11 21 31 41 51 

| | | | | | 

KGENDPPAVE APFSFRSLFG LODZJCISPVA PDADAVAAQI LSLLPLKFFP IIVIGIIALI 60 

LAIiAIGLGIH FDCSGKYRCR SSFKCZELIA RCDGVSDCKD GEDEHfRCVRV GGQKAVLQVF 120 

TAASWKIMCS nOWKGHYANV ACAQl/^FPSY V88DNLRVSS LEGQFREEFV SIDBLLPDDX 180 

VTALEESVYV RB6CAS6BW TLQCTAOGaR RGYS5RZVG0 MMSLLSQWPW QASLQFQOyR 240 

IiCGGSVITPL WIITAAHCVY DLYLPKSWTI QVGLVSIiLDN PAPSHLVEKI VYHSKYKPKR 300 

IX3NDIALMKL AGPLTFNEHI QPVCLFNSEB NFPD6KVCWT SGWGATEDGG DASPVUIHAA 360 

VPLISNKICN HRDVYGGIXS PSHLCAG7LT G6VDS0QGDS G6PLV0QERR LWKLVGATSF 420 
GIGCAEVNKP GWTEVTSPL DWIHBC94ERO LKT 

Seq ID NO: 434 DNA sequence 

Nucleic Acid Accession #: NM_000493.2 

Coding sequence: 97..2139 

1 11 21 31 41 51 

| | | | | | 

CACCrrCTGC ACTGCTCATC TGGGCAGAGG AAGCTTCAGA AAGCTGCCAA GGC3VCCATCT 60 

CCAGGAACTC CCA6CACGCA GAATCCATCT GAQAATATGC TOCCACAAAT ACCCTTTTTG 120 

CTGCTAGTAT CCTTGAACTT GOTTCATGGA GTGTTTTACG CTGAACX3ATA CCAAATGCCC 180 

ACAGGC3W:AA AAOGCCCACT ACCCAACACC AAGACACAGT TCTTCATTCC CTACACCATA 240 

AAQAGTAAAG GTATAGCAGT AAGAGGAGA6 CAA6GTACTC CTGGTCCACC AGGCCCTGCT 300 

GGACCTOGAG GGCACTCAGG TCCTTCTGGA CCAOCAGGAA AACCAGGCTA CGGAAGTCCT 360 

GGACTCCAAG GAGAGCCAGG GTTGCXZAGGA CCACOGQGAC CATCAGCTGT AGGGAAACX3V 420 

GGTGTGCCAG GACTCCCAGG AAAACCAGGA GAGAGAGGAC CATATG6ACC AAAAGGAGAT 480 

GTTG6ACCA0 CTGGCCTACC AGGACCCCGG GGCCCACCAG GACCACCTGG AATCCCTGGA 540 

C06GCTGGAA TTTCIGTOCC AGGAAAACXT GGACAACAG6 GACCCACAGG AGCCCCAGGA 600 

CCCA660GCT TTCCTGGAGA AAAG6GTGCA OCAGGAGTCC CTGOTATGAA TGGACAGAAA 660 

G66GAAATGG GATATGGTGC TGCTGGTOGT CCAGGTGAGA GGGGTCTTCC AGGCCCTCAG 720 

GGTCCCACAG GACCATCTGQ CCCTCCTGGA GTGGGAAAAA GAGGTGAAAA TGGGGTTCCA 780 

GGACAGCCAG GCATCAAAGG TGATAGAGGT TTTCCX3GGAG AAATGGGACC AATTGGCCCA 840 

CCAGGTCCCC AAG6CCCTCC TGGG6AA06A GGGCCAGAAG 6CATTGGAAA GCCAGGAGCT 900 

GCTGGAGCXX: CAG6CCAGCC AG06ATTCCA G6AACAAAAG GTCTCCCTGG G6CTCCA0GA 960 

ATAGCTGGGC CCCCAGGGCC' TCCTGGCITT GGOAAACCAG GCTTGCCAGG CCTGAAGGGA 1020 

GAAAGAGGAC CTGCTCGCXTT TCCTGGGGGT CCAGGTGCCA AAGGGGAACA AGG6CCAGCA 1080 

GGTCTTCCTG 6GAAGCCAGG TCTOACTGGA CaXCTGGGA ATATY3GGACC CCAAGGACCA 1140 

AAAG6CAT0C OQGGTAOCCA TGGTCTCCCA 6GCCCTAAAG GTGAGACAG6 GCCAOCTGGG 1200 

CCTGCAOGAT ACCCTGGGGC TAAOGGTGAA AGGGGTTCCX: CTQQGTCAaA TGGAAAACCA 1260 

GGGTACCCAG GAAAACC3W3G TCTCGAT6GT CCTAAGGGTA ACCCAGGGTT ACCAGGTCCA 1320 

AAAGGTGATC CTGGAGTTGG AGGACCTCCT GGTCTCCCAG GCCCTGTGGG CCCAGCAGGA 1380 

GCAAAOGGAA TGCCCGGACA CAATGGA6AG GCTGGCCCAA GAGGTGCCCC T6GAATACCA 1440 

GGTACTAOAO 600CTATT66 6CCACCAGGC ATTCCAGGAT TCOCTQGGTC TAAAGGOOAT 1500 

CCAGGAACTC CC6GT0CTCC TG6CCCAGCT GGCATAGCAA CTAAG6GCCT CAATOGACCC 1560 

ACCCGGCCAC CAGGGCCTCC AGGTCCAAGA GGCCACTCTG GAGAGCCTGG TCTTCCAGGQ 1620 

CCCCCTGGGC CTCC3WX3CCC ACCAGGTCAA GCA6TCATQC CTGAGGGTTT TATAAAGGCA 1680 

6GCCAAAGGC CCAGTCTTTC TGGGACCOCT CTTGTTAGT6 CCAACCAGGG GGTAACAGGA 1740 
ATGCCTGTGT CTGCTTTTAC TGTTATTCTC TCCSUUVGCTT ACCCAGCAAT AGGAACTCCC 
ATACCAnTG ATAAAATTTT GTATAACAGG CAACAGCATT ATGACCCAAG GACTGGAATC 
TTTACTTGTC AGATACCAGQ AATATACTAT TTTTCATACC AOSTGCATGT GAAAGOtSACT 
CATGTTTGGG TAGGCCTGTA TAAGAATGGC ACCCCTGTAA TGt&CKCCTA 10AT6AKXAC 
ACCAAAtSGCT ACCTGGATCA GGCTTCAGGG AGTGCCATCA TOGATCTCAC AGAAAATGAC 
CAGGTGTOGC TCCAGCTTCC CAATGCOGAO TCAAATGGCC TATACTCCTC TGAGTATGTC 
CACTCCTCTT TCTCAGGATT CCTAGTGGCT CCAATGTGAG TACACCCCAC AGAGCTAATC 
TAAATCTTGT GCTAfiAAAAA GCATTCTCTA ACTCTACCOC ACCCTACAAA ATGCATATGG 
AGGTAGGCTQ AAAAGAATGT AATTTTTATT TTCTGAAATA CAGATTTGAQ CTATCft GACC 
AAC3UUICCTT CCCCCTGAAA AGTCAGCAGC AACGTAAAAA CGTATGTGAA GCCTCTCTTG 
AATTTCTA6T TAGCAATCTT AAGGCTCTTT AAGGTTTTCT CCAATATTAA AAAATATCAC 
CAAAGAAOTC CSO C C A HOn AAAAACAAAC AACAAAAAAC AAAGCAACAA AAAAAAAAAT 
TAAAAAAAAA AACAGAAATA GAGCTCTAAG TTATGTGAAA TTTGATTTGA GAAACTCGGC 
ATTTCCTTTT TAAAAAAGCC TGTTTCTAAC TATGAATATG AGAACTTCTA GGAAACATCC 
AGGAOGTATC ATATAACTTT GTA6AACTTA AATACTTGAA TATTCAAATT TAAAAGACAC 
TGTATCCCCT AAAATATTTC TGATGGTGCA CTACTCIGAG GCCTGTATGG CCCCTTTCAT 
CAATATCTAT TCAAATATAC AGQTQCATAT ATACTTeTTA AAGCTCTTAT ATAAAA AAGC 
CCCAAAATAT TGAAGTTCAT CTQAAATGCA AGGTGCTTTC ATCAATGAAC CTTTTCAAAA 
CTTTTCrATG ATTGCAGAGA AGCTTTTTAT ATACCCAGC3V TAAC TTGGA A ACAGGTATCT 
GACCTATTCT TATTrAGTTA ACACAAGTGT GATTAATTTG ATTTCITTAA TTCCT TATTG 
AATCTTATGT GATATGATTT TCTOGATTTA CAGAACATTA GCACATGTAC CTTGTGCCTC 
CCATTCAAGT GAAGTTATAA TTTACACTGA GOGTTTCAAA ATTGQACTAG AAGTGGAGAT 
ATATTATTTA TTTATGCACT GTACTGTATT TTTATATTGC TGTTTAAAAC TTTTAAGCTG 
TGCCTCACTT ATTAAAGCAC AAAATGTTTT ACCTACTCCT TATTTAOSAC ACAATAAAAT 
AACATCAATA GATTTTTAGO CT6AATTAAT TTGAAAGCAG CAATTTGCTG TTCTCAAOCA 
TTCTTTCAA6 GCTTTTCATT 06ACACAATA AAATAAC31TC AATAG 

Seq ID NO: 435 Protein sequence 
Protein Accession #: NP_000484.2 

1 11 21 31 41 51 

| | | | | | 

MLPQIPFLLL VSIJJLVHGVF YAERYQMPTXS IKGPIiPNTKT QFFIPVTIKS KGIAVRGBQG 60 

TPGPPGPAGP RGHPGPSGPP GKPGYGSPGL QGEPGLPGPP GPSAVGKPGV PG LPGK PGER 120 

GpyGPRC23VG PA6LPGPRGP PGPPGXPGPA GISVPGICPGQ QGPTGAPGPR GPP GgG APG 180 

VPGMNGQKGE i4GYGAP6RPG ERGIiPGFQGP T6PS6PP6V6 KRCTIGVFGQ PGIXODRGFP 240 

GEMGPIGPPG PQGPP6BR6P B6IGKFGAAG APGQPGIPGT KGLP6APGIA 6PP6PPGF6K 300 

PGLPGLKGER GPAGLPGGPG AKGEQGPAGL PGKPGLtGPP GNMGPQGPKG IPGSHGLPGP 360 

KGETGPAGPA GYPGAKGERG SPGSDGKPGY PGKPGIiDGPK GNPGLPGPKG DPGVGGPPGL 420 

PGPVGPAGAK Q4P(SNGEAG PRGAP6IPGT RGPIGPFGIP GFPGSR6DPG SPGPPGPAGI 480 

ATKGLHGPTG PPGPPGPRG8 SGEFGIiPGPP GPPGPPGQAV NPB6FIKA8Q RPSLSQTPLV 540 

SANQGVTGMP VSAFTVZLSK AYPAIGTPIP EDKILYMROQ BYDPRTOIFT GQIPGIVYFS 600 

YBVBVKGTBV WVGLYKNGTP VMyTYDEyTR GYLDQASGSA IZDLTENDQIV HLGfLFNABSN 660 
GLYSSEYVHS SF6GFLVAPN 

Seq ID NO: 436 DNA sequence 
Nucleic Acid Accession #: XM_062811 
Coding sequence: 1..888 

1 11 21 31 41 51 

| | | | | | 

ATGTGGGGOS CTOGCOOCTC GTCOGTCTCC TCATCCTGGA AOSOOGCTTC OCTCCTGCAG 60 

CTGCTGCTGG CTGOGCTGCT GGOGGCGGGG GCGAGGGCCA GOGGOGAOTA CTGCCACGGC 120 

TGGCTGGACG OGCAGGGCGT CTGGCGCATC GGCTTCCAGT GTCCOGAGCG CTTCGACGGC 180 

OOCX5ACX3CCA CCATCTGCTG CQGCAOCTGC GCGTTGC6CT ACT6CTOCTC CAGC 600G AG 240 

GC3GCGCCTGG ACCAGOGCGG CTGCGACAAT GACOGCCAGC AQGG OGCTO G OGAOOTTO QC 300 

OGGGOGGACA AAGAOQQCCC CGAOGGCTCG GCAGTGCCCA TCTAOSTGCC GT TCCTC ATT 360 

GTTGGCTCOO TOTTTGTOOC CTTTATCATC TTGG6GTCCC TOQTGQCAGC CTGTTGCTGC 420 

AGATCTCTCC GGCCTAAGCA GGATCCCCAG CAGAGCCQAG CCCCAGGGGG TAACCGCTTG 480 

ATGGAGACCa TCCCCATGAT CCCCAGTGCC AGCACXTCCC GGGGGTOGTC CT CAOGC CAG 540 

TCCAGCACAG CTGCCAGTTC CAGCTCCAOC CCCAACTCAG GGGCCOGGGC GCCCCCAACA 600 

AOGXCACAGA CCAACTGTTQ CTTQCOGGAA GGGACCATGA ACAAOGTGTA TGTCAACAT6 660 

CCCAOGAATT TCTCTGTGCT GAACTGTCAG CA6GCCACCC AGATTGTGCC ACATCAAGQG 720 

CAGTATCTOC ATCCCCCATA C3GTGGGGTAC AOGGTGCAGC ACGACTCTGT GCCX3VTGACA 780 

GCIGTGCCAC CTTTCATGGA CSGCCTGCAG OCTQOCTACA QGCAGATTCA GTCCCCCTTC 840 
CCTGACAOCA ACAGTGAACA GAAGATGTAC CCA609GTGA CTGTATAA 

Seq ID NO: 437 Protein sequence 
Protein Accession #: XP_062811 

1 11 21 31 41 51 

| | | | | | 

MHGARRSSVS SSWNAASIiLQ LLLAALZAAG ARASGEYCHG WL0AQ6VWRI GFQCPERFDG 60 

GDATIC06SC AIiRYCCSSAB ARLDQGGCDN DRQQGAGEPG RADKDGPDGS AVPIYVPFLI 120 

VGSVPVAFII liGSLVAACCC RCLRPKQDPQ QSRAPGGHRL METIPMIPSA STSRGSSSRQ 180 

SSTAASSSSS ANSGARAPPT RSQTNCXiPB GTMNNVYVNM PINPSVLNOQ QATQIVPHQG 240 
QYLHPPYVGY TVQHDSVPMT AVPPPMDGLQ PGYRQIQSPF PR7NSEQKKY PAVTV 

Seq ID NO: 438 DNA sequence 

Nucleic Acid Accession #: NM_004004.1 

Coding sequence: 1..681 

1 11 21 31 41 51 

| | | | | | 

ATGGATTGGG GCAOGCTGCA GAOGATCCTG GGGGGTGTBA ACAAACACTC CACCAGCATT 60 

OGAAAGATCT GGCTCACCGT OCTCTTCATT TTTCGCATTA TGATCCTOGT TGTGGCTGCA 120 

AAGGAGGTGT GGGGAGATGA GCAGGCOSAC TTTGTCTGCA ACACCCTGCA GCCAGGCT6C 180 
AA6AA0GTGT GCTAOGATCH CTACTTCOCC ATCTCCCACA, TCCGGCTAT6 G6C0CIGCA0 240 

CTGATCrrOG TGTCCAGCCC AGCGCTCCTA GTG6CCATGC AOGTGGCCTA CCBGAGACAT 300 

GAGAAGAAGA 66AAGTTCAT CAAGGGGGAG ATAAAGAGTG AATTTAAGGA CAT06AGGAG 360 

ATCAAAACCC AGAASGTCOG CATCGAAGGC TCCCTGTGGT GGACXTTACAC AAGCAGCATC 420 

TTCTTCCGGG TCATCTTCGA AGCCGCCTTC ATGTACGTCI TCTATGTCAT GTAOGAQGGC 480 

TTCTCCATCC AOOGGCTGGT GAAGTGC3UVC GCCTGGOCTT GTCCCAACAC T CTGGACTG C 540 

■m X jIl t 'l'CLX; G60CCA0GGA GAAGftCTGTC TTCACAGTGT TCATGATTGC AGTOTCIOGA 600 

ATTTGCATCC TCCTGAATGT CACIGAATtG TGTTJCmGC TAATTASftTA I W iTCTGGG 660 
AAGTCAAAAA AGCCftGTTTA A 

Seq ID NO: 439 Protein sequence 
Protein Accession #: NP_003995.1 

1 11 21 31 41 51 

| | | | | | 

MDWGTLQTIL GGVNKHSTSI GKIWLTVIiPI FRIMILWAA KEVW6DBQAD PV OTTm PGC 60 
KKVCYDHY7P ISHIRIjMALQ LIFVSSPALL VAMRVAYRBH EKKRKFIRQB 2KSSFKDIEE 120 
IKTQKVRIBO SLKWrYTSSI PPRVIPEAAP MyVPYVMVDO PSMQRIiVKCM AHPCPUTVDC 180 
FVSRPTEKTV FTVPMIAVSO ICIIiLMVTBI* CYLLIRYCS6 XSKXBV 



Seq ID NO: 440 DNA sequence 

Nucleic Acid Accession #: XM_061091.1 

Coding sequence: 1..2481 

1 11 21 31 41 51 

| | | | | | 

ATGOCAAATA CTTCAOGAAC AAOCAGGATT GAAATTTGGC TTCTCCAAGA GCCGCCCGGG 60 

CACGGAGGGC TGGT08C0GC *rCrOdriXXS a GTGAGTCCCA GCOCaSAGTT OGCTCTGGOG 120 

CCCGGOTACC OSCCAGTGCC GGCT6C0GAT GACCGATTCA CGCTCCCGAT GATTGGAGGT 180 

CAGATOCATG 6TGAGAAGGT AGATCTCTGG AGCCTTQGTQ TTCTTTGCTA TGAATTTTTA 240 

GTTGGGAAOC CTCCTTTTGA G6CAAACGAA GTCCATGTAA GCAAAGAAAC CATCGGGAAG 300 

ATTTCAGCTG CCABCAAAAT OATOTGGTGC TCGGCTQCRO TGGACATCAT GTTTCTGTTA 360 

GATGGGTCTA ACAOOOTCGG GAAAGGGAGC TTTOAAAOGT 0CAA6CACTT TGOOTCACA 420 

GTCTGTGAOG GTCTGGACAT CAGCCCCGAG AGGGTCA3A0 TGGGAGCATT CCAGTTCAGT 480

TCCACTCCTC ATCTGGAATT CCCCTTGGAT TCATTTTCAA CCCAACAGGA AGTGAAGGCA 540 

AGAATOVAGA 6GATGGTTTT CAAAGGA6GG G6CAC6GAGA CGGAACTTGC TCTGAAATAC 600 

CTTCTGCACA 0AQGGTT6CC TGGAG6CAGA AATGCTTCTG TGOOCCAGKI CCTCATCATC 660 

GTCACZGATG GQAAGTCCXA QGGGGATGTO 6CACT6CCAT CCMVGCAOCT GAMXaAAOG 720 

GGTCTCACTO TGTTTGCTGT GGGGGTCAGG TTTCCCAGGT GGGAGGA6CT GCATGCACTG 780 

GCCAGCGAGC CTAGAGGGCA GCACGTGCTG rTGGCTGAGC AGOTOGAGGA TGCCACCAAC 840 

GOCCTCTTCA GCA(XCTCAG CAGCTOGGCC ATCTGCTOCA GOGCCACGCC AGCTGGGAGC 900 

0CC6A6CTT6 TCTTCATGGA 6C0GTTAAT0 GGCATCTCTC TCATAGGCCC CTGTGACTaS 960 

CAGCCCTGCC ASAATGGAGG CACATGTGTT CCASAA66AC TG6ACG6CTA CCA6T6CCTC 1020 

TGCCCX5CTGG CCTTTGGACG GGAGGCTAAC TGTGCCCTGA AGCTGAGCCT GGAATGCAGO 1080 

GTOGACCrCC TCTTCCTGCT GGACAGCTCT GOGGGCACCA CTCTGGAaSO CTTCCTGCGG 1140 

GCCAAA6TCT TOGTGAAGOQ GTTTGTGOGG GCCGTGCTGA GOGAGGACTC TOGGGCCCGA 1200 

OTGGGTGTGG CCACATACAG GAGGGAOCTG CTGOTGGOGG TGCCTGTQGG CGAGI AOCAG 1260 

GATGTGCCTG ACCTGGTCTG 6A0CCT0QAT GGCATTCCCT TCOyTGGTGG COCCACCCTG 1320 

ACGGGCAGTG CCTTGGOGCA GGCGGCA6AG CXJTGGCTTCG GGAGCGCCAC CAGGACAGGC 1380 

CAGGACCGGC CACGTAGAGT GGTGGTTTTG CTCACTGAGT CACACTCOGA GGATGAGGTT 1440 

G0G6GCCCAG 060G7CA08C AA6GG06GGA GAGCTGCTCC TGCTGGGTGT AGGCAGTGAG 1500 

GCOQTOOGQG CA6A6CT0GA 0GA6ATCACA GGCAOOCCM AGCA76TGAT GGTCTACTOQ 1560 

GATCCTCAGG A TCTGTTCAA CCAAATCOCT GAGCTGCAGG G6AAGCTGTX3 CAGCCGGCAO 1620 

OGGOCAGGGT 6C0GQACACA AGCCCTGGAC CTOGTCTTCA TGTTGGACAC CTCTGC CTCA 1680 

GTAGGGCCOG AGAATTTTGC TCAGATGCAG AGCTTTGTGA GAAGCTGTOC 0CTCCA6TTT 1740 

GAGQTQAACC CTGAOGTGAC ACAGGTCOGC CTGGTGGTCT ATGGCAGOCA GGTGCAGACT 1800 

GCCTTCGGOC TOGACACCAA ACOCACCOGG GCTQCQATGC TGOOGGOCAT TAGCCRGGCC 1860 

CCCTACCTAG GTGGGGTGGG CTCAGCCGGC AOOGCCCTOC TSCACATCTA TQACAAAGTO 1920 

ATGACCGTCC AGAGGGGTGC CCGOCCTGGT OTCCCCAAAG CTGTGGTGGT GCTCACAGGC 1980 

GGGAGAGGCG CAGAGGATGC AGCOGTTCCT GCCCAGAAGC TGAGGAACAA TGGCATCTCT 2040 

GTCTTGGTOG TGQQOSTGGG GCCTGTCCTA AGTGAGGGTC TGCGGAGGCT TGCA6GTCCC 2100 

CGGGATTCCC TOATOCACGT GGCAGCTTAC QCXSQACCItSC OOTACCAOCA GGA CGTGCT C 2160 

ATTGAGTGGC TG TO TGGAGA AGCCAAGCAG CCAOTCAACC TCT6CAAACC CAGCC0GT6C 2220 

ATGAATGAGG GCAGCTGCGT CCTOCAGAAT GGGAGCTACC GCTGCAAGrG TCGGGATGGC 2280 

TGGGAGGGCC CCCACTGCGA GAACOGTGAG TGGAGCTCTT GCTCTGTATO TGTGAGCCAG 2340 

GGATOOATTC TTGAGACGCC CCTGAGGCAC ATGGCTCCOS TQCAGGAGGO CAGOiOCCGT 2400 

ACCCCTCCCA 6CAACTACAG A6AAGGCCTG GGCACTGAAA TGOTGCCTAC CTTCTGGAAT 2460 
GTCTGTGCCC CAGGTCCTTA G 

Seq ID NO: 441 Protein sequence
Protein Accession #: XP_061091.1

1 11 21 31 41 51

| | | | | |

MPNTSGTTRI EIWLLQEPPG HRALVAALLP VSPSPELAIA PGYPPVPAAD DRPTLPMIGG 60 

<^GEKVDLH ' SLGVIiCYEFL VGKPPFEANE VEVSKE7ZGR ISAASKMMHC SAAVDIMFLL 120 

DGSNSVGKGS PERSKHPAIT VCDGLDISPE RVRVGAPQPS STFBLSFPXA SFSTQQEVKA 180

RIKRMVPKQO RTETELALKY UfiRGLPGGR KASVPQILII VTDGFBQGDV ALPSKQLKER 240 

GVTVPAVBVR FPRHEEIfiAL ASBPRGQHVIj LAEQVEDATN GLFSTLSSSA ICSSATPAGS 300 

PELVFMERIM GISItlGPCDS QPCQNGGTCV PEGIiJGYQCL CPIAFGGEAN CALKLSLEOl 360 

VDLLFLLDSS AGTTLDGPLR AKVFVKRPVR AVLSEDSRAR VGVATYSREL LVAVPVGEYQ 420 

DVPDLVWSLD GIPPRQGPTL TGSALRQAAB RGPGSATRTG QDRPRRVWL LTBSHSEDBV 480 

AGPARHARAR ELLLLSVGSE AVRAELEBIT GSPKHVMVYS DPQDLFNQIP EU3GKLCSRQ 540 

RPGOtTQAU) IiVFMLDTSAS V6PEKFAQKQ SPVRSCAliQP EVNPDVTQVG LWYGSQVQT 600 

AF6LDTKPT& AAMLRAI8QA pyLQQVGSAG TALI^IYDEV MTVQRGARPG VPKAVWLTG 660 

GRGAEDAAVP AQRUtHMGIS VLWGVGPVIi SEdtRSLAGP RDSUHVAAY ADLRYHQDfVL 720 
lEWIiOGSAKQ FVHLCKPSPC MNBGSCVLGK GSYRCKCHDG HBSPRCEHBB WSSCSVCVSQ 780
GWZIiETPUtH MAPVQE6SSR TPPSHYRECL GTStVPTFHN VCAFGP 

Seq ID NO: 442 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..2424

1 11 21 31 41 51

| | | | | |

ATGCCCCCTT TCCTOTTGCT GGAGGCOGTC TGTGTTTTCC T6TTTTCCAG AGTGCCCCCA 60 

TCTCTCOCTC TGCAGGMU3T C C ATGT A ACC AAAGAAACCA TOGGGAASAT TT CA0CT 60C 120

AGCAAAATGA TGTGGT6CTC G6CTGCAGTG GACATCATGT TTCT6TTAGA TGG6TCTAAC 180 

AGCGTOGGGA AAGGGA6CTT TGAAAG6T0C AAGCACTTTG CCATCACAGT CTGTGAOGGT 240 

CTGGACATCA GCCCCGAGAG GGTCAGAGTO OGAGCATTCC AGTTCACTTC CACTCCTCAT 300 

CTGGAATTCC CCTTGGATTC ATTTTCAACC CAACAGGAAG TGAA6GCAA6 AATCAAGAGG 360 

AT6GTTTTCA AA0GAGGGC6 CACGGAGAOG GAACTTGCTC TGAAATACXrT TCIGCACAGA 420 

GGGTTGCCTG GAGGCAGAAA TGCTTCTGTG CCCCAGATCC TCATCATCGT GACTGATGGO 480

AAGTCCCAGG GG6ATGTGGC ACTGCCATCC AAGCAGCTGA AGGAAAG6GG TGTCACTGT6 540 

TTTGCTGTGG GGGTCAGGTT TCXICAGGTGG GAGGAGCTGC ATGCACTGGC CAGCXSAGCCT 600 

A6AGGGCAGC AOGT G CTGTT GGCTGAGCAO GTGGAGGA1X3 CCACCAACX36 CCTCTTCAGC 660 

ACCCTCAGCA GCTOGGCCAT CTGCTCCAGC GCCACX30CAG ACTGCAGGGT CGAGGCTCAC 720 

OCCTCTGAGC ACAGGAOGCT GGAGATGGTC OGGGAGTTOO CTG6CAATGC CXXa^TGCTGG 780 

AGAGGATOSC G600GACCCT TGOGGTGCTG GCTGCACACT GTCCCTTCTA CA6CT6GAA6 840 

A6AGTGTTCC TAACCCACCC TQCCACCT6C TACAGGACCA CCTGCCCA60 CCCCTGTGAC 900 

T06CAGCCCT GCCAGAATGG AGGCACATGT GTTCCAGAAO GACTGGACGG CTACCAGTGC 960 

CTCTGCCCGC TGGOCnTGG AGGGGAGGCT AACTGTGCXX TGAAGCTGAG OCTGGAATCC 1020 

AGGGTOGACC TCCTCTTCCT QCTGGACAGC TCTGCGGGCA CCACTCTGGA 0G6CTTCCT6 1080 

C6GGCCAAAG TCrTOGTGAA G06GTTTGIG CGGGOOGTGC TGA6CGAG6A CTCTOGGGCC 1140 

03AOTGGGTG TGGCCACATA CAGCAGGGAG CitiCl X aUTW OGGTGCCTGT GGGGGAGTAC 1200 

CAGGATGTGC CTGACCT6GT CTGGAGCCTC GATGGCaTTC OCTTCCGTCO TGGCCCCACC 1260 

CTGACGGGCA GTGCCTTGCG GCAGGOGGCA GAGCGTGGCT TCGGGAGOSC CACCAGGACA 1320 

GGCCAGGACC 6GCCA0QTA6 AGT66TGGTT TTGCTCACTG AGTCACACTC OGAGGATGAG 1380 

6TTG06G6CC CAGOGOGTCA OGCAAGG60G 06AGA6CT6C TCCTGCTGOG TGTAOGCAGT 1440 

QAGGCOGTQC GGGCAGAGCT GGAGGAGATC ACA6GCAG0C CAAAGCATGT (SITGGTCTAC 1500 

TOGGATCCTC AGGATCTGTT CAACCAAATC CCTGAGCTCC AGGGGAAOCT GTGCAGCCGG 1560 

CAGCGGCCAG GGTGCC3GGAC ACAAGCCCTG GACCTOGTCT TCATGTTGGA CACCTCTGCC 1620 

TCAGTA6GGC COSAGAATTT TGCTCAGATG CAGAGCTTTG TGAGAAGCTG TGCCCTCCAG 1680 

TTTGAOGTGA AOCCTGAOGT GACACAGGTC GG C XTOSTG G TGTAT6GCA6 CCAGGTGCAG 1740 

ACTGCCTTC6 G6CTGGACAC CAAACCCACC OGGGCTGOGA TGCTG06GGC CATTAGCCAG 1800

GCCCCCTACC TAGGTG6GGT GGGCTCAGCC GGCACCGOCC TGCT6CACAT CTATGACAAA 1860 

GTGATGACOG TCCAGAGGGG TGCCCGGCCT GGTGTCCCCA AAGCTGTGGT GGTGCTCACA 1920 

GG0GGGAGA6 G06CAGAGGA TGCAGCOGTT CCTGCCX3U3A A6CTGAGGAA CAATGGCATC 1980 

TCT6TCTTG6 TOGTQOGCQT OGGGCCItSTC CTAAGT6AGG GTCT6C6GAG GCTTQCAOGT 2040 

00CC6GGATT CCCTGATCCA OGTGGCAGCT TA06CGQA0C T60GGTACCA OCAQGAOQTO 2100 

CTCATTGAGT GGCTGTGTGG AGAAGCCAAG CAGCCAGTCA ACCTCTGCAA ACCCAGCCCG 2160 

TGCATGAATG AGGGCAGCTG CGTCCTGCAG AATGGGAGCT ACCGCTGCAA GTGTCGGGAT 2220 

GGCTGGGAG6 GCCCCCACTG CGAGAACG6T GA0TGGA6CT CTTGCTCTGT ATGTGTGAGC 2280 

CA6GGATG6A TTCTTOAOAC GC0CCTGAG6 CACATGGCTC 008TGCAGGA GGQCA6CAGC 2340 

GGTACCXXrrC CCAGGAACTA CAGAGAAGGC CTGGGCACTG AAATGGTGOC TACCTTCTGG 2400 
AATGTCTGTG CCCCAGGTCC TTAG 

Seq ID NO: 443 Protein sequence 
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

KPPFLLIiEAV CVFLFSRVPF SLPLQEVHVS KETIGKISAA SKMHHCSAAV DIMFLIiDGSN 60 

SVGRSSFERS KHPAITVCDG LDISFERVRV 6AFQFSSTPH IiBFPU>SFST QQEVKARIKR 120 

HVFKGGRTET ELALKYLLHR GLPOGRNASV PQXLIZVTDO KSQ6DVALPS KQLKERGVTV 180 

FAVGVRFPRH EBLBALASEP RGQHVIiLABQ VEDAIVGLFS TLSSSAICSS ATPDCRVEAH 240 

PCEHRTLENV REFAG^IAPCM RGSRRTLAVL AARCPFYSWK RVFLTHPATC YRTTCPCPO) 300 

SQPOQNGGTC VFEGLDGYQC LCPLAFGGEA NCALKLSLEC RVDIiIiFLU)S SAGTTLDGFL 360 

SAXVPVKRPV RAVLSEDSRA RVGVATYSRB LLVAVPVGEy QDVPDLVHSL DGZFFRGGPT 420 

LT6SALRQAA ERGFGSATRT GQDRFRRVW LLTBSBSBJB VAGPARBARA RELIiLIiGVGS 480 

EAVRAELEBI TGSPKHVMVY SDPQDLFNOI PELQGKLCSR QRPGCRTQAL DLVFMLDTSA 540 

SVQPEI7FAQM QSFVRSCALQ FEVNPDVTOV GLWYGSQVQ TAFGUmCPT RAAMLRAISQ 600 

APYIiGGVGSA GTALLBZYDK VMTVQRGARP GVPKAVWLT GCSIGASDAAV PAQKLRKNOZ 660 

6VLW6V6FV LSB6LRRIA0 PRDSLZBVAA YADLRYHQDV LIBWL08BAK QPVHX.CKPSP 720 

CMNB68CVLQ NGSYRCRCRD GWBGPHCBIR BHSSCSVCVS QGWILETPLR HNAPVQE6S8 780 
RTPPSNYREG LGTEKVPTPW NVCAPGP 

Seq ID NO: 444 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 89..2356

1 11 21 31 41 51

| | | | | |

GCCGCCXG6C COGAGCCGOS CCCGGGTCTO TOAGTAQMIC 06CCCSGGCA CC6AGCGCTG 60 

GTCGCCGCTC TOCTTCCGTT ATATCAACAT OC C CC C TTTC CitSTllSClSSG AA6CCGTCTG 120 

IXSl TlT C CnJ TTTTCCAGAG T6CCCCCATC TCTCOCTCTC CAGGAAGTCC ATGTAAGCAA 180

AGAAACCATC G6GAAGATTT CAGCTGCCA6 CAAAATGATG TGGTGCTCGG CTGCAGTGGA 240 

CATCATGTTT CTGTTAQATG GGTCTAACAG 06TGGGGAAA GGGA6CTTTG AAAGGTCCAA 300 

GCACriTGCC ATCACAQTCT GTGAC6GTCT GGACATCA6C OOOQAGAGGG TCAGAGTGGQ 360 

AGCATTCCA6 TTCA6TTCCA CTCCTCATCT G6AATTCCCC TTOGATTCAT TTTCAACCCA 420 

ACAGGAAGTG AA66CAAGAA TCAAGAGGAT GGTTTTCAAA GGAGGGCGCA 0GGAGACG6A 480 

ACTTGCTCTQ AAATACCTTC TGCACAGAGG GTTGCCTGGA GGCAGAAATG CTTCTGTGCC 540 

CCAGATCCTC ATCATCGTCA CT6ATGGGAA GTCCCAGGG6 GAT6T6GCAC TGOCATOCAA 600 
GCA6CTGAA6 GAAAGGG6TG TCaCTgEGTT T6CIOTGGG6 GTGAGGTTTC 0CAGGTGC5GA 660

GGAGCTGCHT GOICTGGCCA G06A6CCTA5 AGGGCAGCAC GTOCi'grfGG CTGAGCAGGT 720 

G6AGGATGCC ACCAAOGGCC TCTTCAGCAC CCTCAGCA6C TOGGOCATCT GCTCC3VG0GC 780 

CACGCCAGAC T6CAGG6TOG AGGCTCACCC CTGTGAGCAC AGGAOSCTOG AGATGGTCOO 840 

GGA G TTOGCT G6CAATGCCC CATGCT6GAG AGGATOGOGG OGGACCCTTG OG G TGCTGGC 900 

TGCACACrGT CCCTTCTACA 6CIGQAAGAG AGTGTTCCCA ACCXAOCCTG OCAOCTGCTA 960 

CAG6ACCAOC TGCCCA6GCC OCTGTQACTC GCAG0CCT6C CAGAATGGA6 6CACATGTGT 1020 

TCCAGAAGGA CTGGACGGCT ACCAGTGCCT CTGCCOGCTG GCCTTTGGA6 GGGAGGCTAA 1080

CTGTGCCCTG AAGCTGAGCC TXXSAATGCAG GGTOGACCTC CTCTTCCTGC TGGACAGCTC 1140 

T60GGGCACC ACTCTGGAOG GCTTOCTGOG 66CCAAAGTC TTCGTGAAGC GGTTTGTGOG 1200 

GGCC C TGCTG AG06AG6ACT CTCOOQCCCQ A GT GGGTGTO GCCACATACA GCAG6GAGCT 1260 

6CTGGTG60G GTGGCTGTOO GG6AGTACCA GGATGTGOCT GACCTG G TCT 6GAGCCT0GA 1320 

TG6CATTCCC TTCCGTGGTG GCCCCACOCT GACGGGC3W3T OCCTTGCGGC AGGOGGCAGA 1380 

GCGTGGCTTC GGGAGCGCCA CCAGGACAGG CCAGGACOOG CCACGTAGAG TGGTGGTTTT 1440 

GCTCACTGAG TCACACTCG6 AGGATGAGGT TGCGGGCCCA 6C3GOGTCACG CAAGGGCGCG 1500 

AGAGCTGCTC CT6CT06GT6 TAGGCAGTQA G60CGTG000 GCA6AGCTGG AGGAiQATCAC 1560 

AG6CAGCCCA AAGCATGTGA TG6TCTACTC GGATCCTCAG GATCTGTTCA ACCAAATCCC 1620 

TGAGCTGCAO GGGAAGCTGT GCAGCCGGCA OCGGCCAGGG TGCCG6ACAC AAGCCCTGGA 1680 

CCTCGTCTTC ATGTTGGACA CCTCTGCCTC AG7AGGGCCC GAGAATTTTG CTCAGATGCA 1740 

GAGCTTTGTO AGAAGCTGTG CCCTCCAG7T TGAGGTGAAC CCTGAOGTGA CACAGGTCGG 1800

CCTGGT GG TG TATGGCA6CC AGGTGCAGAC TGCCTTOGGG CTGGACACCA AACCCACCCG 1860 

GGCTGC6ATG CTGOGGGCCA TTAGCCAG6C CCCCTACCTA GGTGGGGTGO GCTCAGCCGQ 1920 

CACC6CCCTG CTGCACATCT ATGACAAAGT GATGAOOGTC CAGAGGGGTG CCCXSGCXTrGG 1980 

TGTCCCCAAA GCTGTQGTGG TGCTCACAGG CGGGAGAGGC GCAGAGGATG CAGCCGTTCC 2040 

T6CCCAGAAG CTGAGGAACA ATGGCATCTC TGTCTTGGTC GTGGGOGTGG GGCCTGTCCT 2100

AAGTGAGGGT CXGCGGAGGC TTGCAGGTCC CCGGGATTCC CTGATCCAOG TGGCAGCTTA 2160 

CGC06ACCTG OGGTACCACC AGGACCIGCT CATTGAGTGG CTGTGTGGAQ AA6CCAAGCA 2220 

GCCAGTCAAC CTCTGCAAAC CCAGCCOGTG CATGAATGAG GGCAGCTGCG 'TCCTGCAGAA 2280 

TGOGAGCTAC CX3CTGCAAGT GTCGGGATGG CTGGGAGGGC CCCCACTGOG A6AACCGATT 2340 

CTTGAGACGC CCCTGAGGCA CATGGCTCCC GTGCAGGAGQ GCAGCAGCOS TACCCXTTCCC 2400 

AGCAACTACA GAGAAGGCCT GGGCACTGAA ATGGTGCCTA CCTTCTGGAA TGTCTGTGCC 2460 

CCAGGTCCTT AGAATGTCTG CTTCCOGCCG TGGCCAGGAC CACTATTCTC ACTGAGGGAG 2520 

GAGQATGTCC CAAC7GCA6C CAT6CTGCTT A6ASACAAGA AA6CAGCT6A T6TCACXXAC 2580 

AAAGGATGTT GTTGAAAA6T TTTGATGTGT AA6TAAATAC CCACTTTCTG TACCTGCTGT 2640 

GCCTTGTTGA GGCTATGTCA TCTGCCACCT TTCCCTTGAG GATAAACAAG GGOTCCTGAA 2700 

GACTTAAATT TAGCGGCCTG ACGTTCCTTT GCACACAATC AATGCTOGCC AGAATGTTGT 2760 
TGACACA6TA ATGCCCAGCA GA66CCTTTA CTAGA6CATC CTTTGGACGG 

Seq ID NO: 445 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MPFPLLLEAV CVFLFSRVPP 8LPLQSVHVS KETIGKISAA 8KMMWCSAAV OIMFLLDGSK 60 

SVQKGSFER5 KHFAITVCDG LOISPERVRV GAFQPSSTPH LSFPLDSFST QQEVKARIKR 120 

MVFK6GRTET ELALKYLLHR QLPGGRMASV PQILIIVTDG KSQGDVALPS KQLKERGVTV 180 

FAVGVRFFRH EELHALASEP RGQHVLLAEQ VEDATNGIiFS TLSSSAZCSS ATPDCRVEAH 240 

PCEBRTLEKV RSFAOKAPCH R68RRTXAVL AAKCPPYSHK RVFLTHPATC YRTTCPGPCD 300 

SQPOQNGGTC VPEGLDGYQC LCPLAFGGEA NCALKLSLBC BVDLLPLLDS SAGTTLD6PL 360 

RAKVFVKRFV RAVLSEDSRA RVGVATYSRE LLVAVPVGEY QDVFDLVWSL DGIPFRGGPT 420 

LTGSALRQAA ERGFGSATRT GQDRPRRVW LLTESHSEDS VAGPAREARA REZiLLLGVGS 480 

EAVRAELEEZ T6SPKHVMVY SDPQOZiFHQI PELQGKLCSR QRPGCRTQAZi DLVFMLOTSA 540 

8VQPSHFAQM Q8FVRSCALQ FEVNPDVTQV GLWYGSQVQ TAF6LDTKPT RAAMLRAISQ 600 

APYIiGGVGSA GTALLHZyDX VMTVQRGASP GVFKAVWLT G6RGAEDAAV PAQKLHtlNGZ 660 

SVIiWGVGPV LSBGbRRLAO PBDSLIHVAA YADLRYBQDV LIEHLOOEAK QFVHXiCKPSP 720 
CMMBGSCVLQ NGSYRCKCRD 6WE6PECE2IR FLRRP 

Seq ID NO: 446 DNA sequence

Nucleic Acid Accession #: NM_031942.1

Coding sequence: 145..1260

1 11 21 31 41 51

| | | | | |

CC30GAGCCCC GCCOCTCCG G GCXrCGGGTOG 606CGCCCAG CCTGCCA6CC GCGCTQCTOC 60 

TGCTCCTCCT GCTGTGGGAC CGCTGACOGC G0G6CTGCTC OQCTCTCCCC GCTCCAAGCO 120 

COGATCTGGO CACCCGCCAC CAGCATGGAC GCTOGCOOOG TGCCGCAGAA AGATCTCAGA 180 

GTAAAGAAGA ACTTAAAGAA ATT CAGATA T 0TGAAGTT6A TTTCCATGGA AACCTOGTCA 240 

7CCTCTGAT6 ACAGTTGTGA CAGCTTTGCT TCTGATAA1T TTGCAAACAC GAG6CTGCAG 300 

TCAGTTOGGG AAOGCTGTAG GACCC6CA6C CA6T6CAG6C ACTCTGGACC TCTCAGGGTG 360 

60GATGAAGT TTCCAGCGCG GAGTAOCA6G GGAGCAACCA ACAAAAAAGC AGAGTCCCX3C 420 

CAGCCCTCAG A6AATTCTGT GACTGATTCC AACTCCGATT CAGAAGATGA AAGTGQAATO 480 

AATTTTTTGG A6AAAAGGGC TTTAAATATA AAGCAAAACA AAGCAATGCT T6CAAAACTC 540 

ATOTCTGAAT TAOAAAGCrr GCCTG6CT0Q TTGGGTQOAA GACATGCCCT COCAOGCTCC 600 

(»CTCACAAT CAAGGAGACC G0GAAG608T ACATTCXOGG GTGTTGCTTC CAGGAGAAAC 660 

CCTXSAACGGA 6AGCT0GTCC TCTTACCAGG TCAAGGTCCC GGATCCTOGG GTCCCTTGAC 720 

GCTCTACCCA TGGAGGAGGA GGAGGAAGAO GATAAGTACA TGTTG6TGAG AAAGAGGAAG 780 

AC06T66ATG GCTACATGAA TGAAGAT6AC CTGO CC AGAA GC0GT06CTC CAQATCATCC 840

8T0ACCCITC OGCATATAAT TOGCCC3U3TQ GAAGAAATTA CAGAGGAGGA GTTOSAGAAC 900 

GTCTGCA6CA ATTCTOGAGA GAAGATATAT AACCGTTCAC TGG6CTCTAC TTGTCATCAA 960 

TGOCXn'CAGA AGACTATTGA TACCAAAACA AACTGCA6AA AOCCAGACTG CT6G6G06TT 1020 

OGAGGCCAGT TCTGTGGCCC CTGCCTTOGA AACCGTTATO GTGAAGAGGT CAGGGATGCT 1080 

CTGCTG G ATC 06AACTGGCA TTGCC060CT TGTGGAGGAA TCTGCAACTG CAGTTTCTGC 1140 

GQaCAGGGAO ATQGACXXntI TGOQACTQCXS Ql'O ei TGTQT ATTTAGCCAA ATATCATGGC 1200 

TTT6GGAAT0 TGCATGCCTA CTTGAAAAGC CTGAAACAGG AATTTGAAAT 6CAAGCATAA 1260 

TATCTGGAAA ATTTOCTGCC TGCCTTCTAC TTCTCAAATC TTTCTTGrAA AAGTTTOCAA 1320 

TTTTTTCACT GAAAOCTGAG TTAAAAATCT TGATGATCAG CCTGTTTCAT AAGAAACTCC 1380 

AATCAAGTTA ATCTTAGCAG ACATGTGTTT CTGGA6CATC ACAGAAG6TA TATTGCTA6T 1440 
TACACTTTGC OCTOC TOCA G r r i tTfLTCl ' GCTCCCAACC OO OVTCTC AT AGCATC OOCC 1500

TCTATTTCCA ATGCTCCTCT CCAAC06CTT AGTTTCTGAA TTTCTTTTAA ATTACAGTTT 1560 

TATGAAAGCA TATTTTATTT ACTTGGTGTT GAAATAGCCC TCATAAAACC TAAGCaCTTG 1620 

QAAACACAAT AATAGTATTA ACTAACTAGA TCTATTQAAT TTCAGAGAAG AGCCTTCTAA 1680 

CTTGTTTACA CAAAAAOGAG TATGATTTAG CSUTTCATACT AGTTGAAATT TTTAATAGAA 1740 

TCAAGGCACA AAAGTCTTAA AAOGATGTGQ AAAAARJUSG TAATTATT6C AGATTGATGT 1800 

CTCTCAATCC CAT6TATTGC GCTTATGTTA CAAeTTGTTO TCACAGTTGA GACTTAATTT 1860 

CTCCTAATTT CTTCTGCCOS AAGGGTAAGT GGTGCGTCCA GCTTACAOGA TCATAATTCA 1920 

AAGGTTGGTG GGCAATGTAA TACTTAATTA AAATAATGAT GGAAGAGCTA TCTGGAGATT 1980 

ATGAGTAAGC TGA7TT6AAT TTTCAGTATA AAACTTTAGT ATAATTGTAG TTTGCAAAGT 2040 

TTAITTCAGT TCACATGTAA GGTATTGCAA ATAAATTCTT GOACAATTTT 8TATGGAAAC 2100 

TTGRTATTAA AAACTRGTCT b ' mcrri ' C ' n ' l GCAGTTTCIT GTAAATTTAT AAAC CAGGCA 2160 

CAAGGTTCAA GTTTAGATTT TAAGCACTTT TATAACAATO ATAAGTGCCT TmXJUAGAT 2220 

GTAACTTTTA GCAGTTTGTT AACCTGACAT CTCTGCCAGT CTACTTTCTG GGCAGGTTTC 2280 

CTGTGTCAGT ATTCCCCCTC CTCTTTGCAT TAATCAAGGT ATTTGGTAGA 6GTQGAATCT 2340 

AAGTGTTTGT ATGTCCAATT TACTT6CATA TGTAAACCAT T6CTGTG0CA TTCAAT GTTT 2400 

GATGCATAAT TGGACCTTGA AT06ATAAGT GTAAATACAO CTTTTGATCT GTAAIGCTTT 2460 
TATACAAAAG TTTATTTTAA TAATAAAATO TITGTTCrAA AAAAAAAAAA 



Seq ID NO: 447 Protein sequence
Protein Accession #: NP_114148.1

1 11 21 31 41 51

| | | | | |

MDARRVPQKD bRVK3a31iKKP RYVKLISMET SSSSDDSC3)S FASDMFANTR LQSVRE6CRT 60 

RSQCRHSGPL RVAMKFPARS TRGATNKKAB SRQPSaJSVT DSNSDSEDES OSNFLEKRAL 120 

MIKQNKAMLA KLMSELESFP GSFRGRHPLP GSDSQSRRPR ERTFPGVA5R RNPBRRARPL 180 

TRSRSRILGS LDALCMEEEB ES)KVNLVRK RKTVDGYMNE DOLPRSRRSR SSVTLFHIIR 240 

PVEBITEEBL ENVCSNSREK XYMRSL6STC HQCRQKTIDT KTNCRNPDCH GVniGQPCGPC 300 

LRNRYGEEVR DALIiDFHWHC PFGRGICKCS FCRQRD6RCA TGVLVYLAia KGFGNVBAYL 360 
KSLKQEFEMQ A 



Seq ID NO: 448 DNA sequence 
Nucleic Acid Accession #: NM_019894 
Coding sequence: 1..1314 



1 11 21 31 41 51

| | | | | |

ATGTTACAGO ATOCTGACAO TGATCAACCT CTGAACAGCC TCGATGTCAA ACOCCTGOGC 60 

AAACCCOGTA TCCCCATGGA GACCTTCAGA AAGGTGGGGA TCCCCATCAT CATAGCACTA 120 

CTGAGCCTGG OGAGTATCAT CATTGTGGTT CTCCTCATCA AGGTGATTCT GGATAAATAC 180 

TACTTCCTCT 6CGGGCA0CC TCTOCACTTC ATCCCGAGGA AGCAGCTGTG TGACGGAGAG 240 

CIGGACTGTC OCTTGOGGGA GGAOGAGGAa CACIGTGTCA AGAOCTTCCC OQAAGGGGCT 300 

QCAIOTGGCAG TOCGCCTCTC CAAGGACOGA TCCACACT6C AQQT6CTGGA CTG8GCCACA 360 

GGGAACTGGT TCTCTGCCTG TTTCGACAAC TTOVCAGAAO CTCTCGCTGA GACAGCCTGT 420 

AGGCAGATGG GCTACAGCAG CAAACCCACT TTCaGAGCTG TGGAGATTGG CCXaCACCAG 480 

GATCTGGATG TTGTTGAAAT CACAGAAAAC A6CCAGGAGC TTOSCATGCG GAACTCAAGT 540 

GQGCCCTGTC TGTCAOQCTC CCTGCTCTOC CTGCACIGTC ITOOCTOI GQ QAAaA9CCTQ 600 

AAOACOCCGC GT6TG6TGGG TGGGGAG6A0 GCCTCT G TSQ ATrCTTOGCC TTGOCAGGTC 660 

AQCATCCAGT ACGACAAACA GCACGTCTGT GGAGGGAGCA TCCTGGACCC CCACTGGGTC 720 

CTCACXIGCAG CCCACTGCTT CAGGAAACAT ACCGATGTGT TCAACTGGAA GGTGOGGGCA 780 

G6CTCAGACA AACTGQQCAG CTTCCCATCC CTGGCTGTGG CCAAGATCAT CATCATTGAA 840 

TTCAACCOCA TGTACCOCAA AOACAAItSAC AT06CCCTCA TGAA6CTGCA GTTOCCACTC 900 

ACTTTCTCAO OCACAOTCAO GCCCATCTGT Ci'QCt.'Cr'rCl' TTGATGAQGA OCTCACTCCA 960 

GCCAOCCCAC TCTGGATCAT TGGATGGGGC TTTAC6AAGC AQAATGGAGG GAAGATGTCT 1020 

GACATACTGC TGCAGGOGTC AGTCCAGGTC ATTQACAGCA CACGGTGCAA TGCAGACGAT 1080 

GOGTACCAGG GG6AAGTCAC OGAGAAGATO AT6T6TGCAS GCATCCCGGA AGGGGGTGTG 1140 

GACACCTCCC A6GGT6ACAG T0GTGG3CCC CTQATGTAOC AATCTGACXA GTG6CATGT6 1200 

GTGGGCATOG TTAGCTGGGG CTATG6CTGC GGG6GCC0SA 6CACCCCA6G AGXATACACC 1260 
AAGGTCTCAO CCTATCTCAA CTGGATCTAC AATGTCTGGA AG6CTGAGCT QTAA 



Seq ID NO: 449 Protein sequence
Protein Accession #: NP_063947.1

1 11 21 31 41 51

| | | | | |

MLQDPDSDQP LNSLDVKPLR KPRIFMETFR XVGIPXIIAL LSLASIIIW VXiIKVILDKY 60 

ypXiCGQPLHF IPRRQIiOXSE LDCPUS^EB ECVKSFPE6P AVAVRLSKDR Sn^QVLDSAT 120 

GNHPSACFDN F7EALASTAC RQNGYSSKPT FRAVEIGPDQ DLDWEITSI SQELRMRNSS 180 

GPCLSGSLVS LHCLAOGKSL KTPRWGGEE ASVDSHPWQV SIQYDKQHVC GGSILDPSWV 240 

LTAAHCFRKH TDVFMWKVRA GSDKLGSFPS LAVAKIIIIE PMPMYPKDND lALMKLQPPL 300 

TFS6TVRPIC LPFFDEELTP ATPLMIIGWG FTKQNGG»(S DILLQASVQV IDSTRQIADD 360 

AYOGEVTERN NCAGIFE06V DT0(ya>S06P UIYQSDQMHV VGXVSKGYOC GGPSTPCVYT 420 



Seq ID NO: 450 DNA sequence

Nucleic Acid Accession #: XM_051860.2

Coding sequence: 52..3042

1 11 21 31 41 51

| | | | | |

GCTCAOCCAG GAAAAATAT6 CAATOGTCCC ATTGATATAC AGGCCACTAC AA TGGA TGGA 60 

GTTAACCTCA GCACOGAGGT TGTCTACAAA AAAOGOCAQG AmtACOTT TG CTTGC TAC 120 

GAOOGGGGCA GAGCCTGCCG OAOCTACOQT GTAOGGTTCC TCTGTOGGAA GCCT GTOAOQ 180 

CCCAAACTCA CAGTCACCAT TGACACCAAT GTGAACAGCA CCATTCTGAA CTTGGAGGAT 240 

AATGTACA6T CATGGAAACC TGGAGATACC CTGGTCATTG CCAOTACTGA TTACTCCATG 300 

TACCAGGCA6 AA6AGTTCCA G6TGCTTCCC TGCAGATCCT G06CCCCCAA CCAGGTCAAA 360 
GTGGCAGGCSA AACOUiTGTA OCTGCACATC GOGGAGGAGA TAGAOGGOGT GGACATGGGG 420

606GAGGTTG GGCTTCTGAG C0G6AACATC AtAGTCSlTGG GG6A6ATGGA G6ACAAATGC 480 

TACCCCTACA GAAACCACAT CTGCAATTTC TTTGACTTCG ATACCTTTOG GGGCCACATC 540 

AAGTTTGCTC TGGGATTTAA G6CAGCACAC TTGGA3GGCA OGGAGCTGAA GCATATGGGA 600 

CAGCAGCTG6 TGGGTCAGTA CCCGATTCAC TTCCACCTGG COG GT GATGT AGAOGAAAGG 660

GGA6GTTATG ACCCAGCCAC ATACATCAGG GACCTCTCCA TOCATCATAC ATTCTCTGGC 720 

TG(3GTCACAG TCCATQGCTC CAATGGCTT6 TTGATCAASG A06TTGTGGG CTATAACTCT 780 

TTGGGCCACT GCTTCTTCAC GGAAGATGGO CCGGAGGAAC GCAACACTTT TGACCACTGT 840 

CTTGGCCTCX: TTGTCAAGTC TGGAA(XCTC CTCCCCTOSO ACCOTGACAG CAAGATGTGC 900 

AAGATGATCA CA6GAGACTC CTACCCAG6G TACATCCCCA AGCCCAGGCA AGACTGCAA7 960

GCT6TGTCCA CCTTCTGGAT OGCCAATOOC AACAACAACC TOVTCAACT6 TGOOGCTGCA 1020 

GGATCTGAGG AAACTGGATT TTGGTTTATT TTTCACCAG6 TACCAACGGG CCXXrTCOGTG 1080

6GAATGTACT CCXXAGGTTA TTCAGAGCAC ATTCCACTGG GAAAATTCTA TAACAACCGA 1140 

GCACATTCCA ACTACCGGGC TGGCATGATC ATAGAOlAOG GAGTCAAAAC CACCGAGGCC 1200 

TCTGCCAAGG ACAAGCGGCC GTTCCTCTCA ATCATCTCTG CCAGATACA6 CCCTCACCAG 1260

GACGCOGAOC OOCTGAAGCC G0GGGAGOC6 60CATCATCA GACACITCAT TGCCTACAAG 1320 

AACCAGGACC AOGGGGCCTG GCT606C60C GGGGATGTGT 6GCTGGACAG CTGC06GTTT 1380 

GCTGACAATG GCATTGGCCT GACCCPGGOC AGTGGTGGAA CCTTCCCGTA TGAOGACGGC 1440 

TCCAAGCAAG AGATAAAGAA CAGCTTGTTr GTTGGCGAGA GTGOCAACGT GGGGACGQAA 1500 

AT6ATGGACA ATAGGATCTG GGGCCCTGGC GGCTTGGACC ATAGOGGAAQ GACCCTCCCT 1560

ATAOGCCAGA ATTTTCCAAT TAGAGGAATT CA6TTATATG ATG6CCCCAT CAACATCCAA 1620 

AACT6CACTT TCOGAAAGTT TGTGGCCXrTO GAGGGCCGGC ACACCAGOGC CCTGGCCTTC 1680 

CGCCTGAATA ATGCCTGGCA GAGCTGCCCC CATAACAACG TGACOGGCAT TGCCTTTGAG 1740 

GAOGTTCOSA TTACTTCCAQ AGTGTTCTTC GGAGAGCC7X5 GGCCCTGGTT CAACCAGCTG 1800 

GACATGGATG GGGATAAGAC ATCTGTGTTC CATGAOGTCG ACGGCTCOGT GTCOGAGTAC 1860

CCTGGCTCCT ACCTCAOGAA GAATGACAAC TG6CTGGTCC GGCACCCAGA CTGCATCAAT 1920 

GTTCCOGACT GGAGAGGGGC CATTTGCAGT GGGTGCTATG CACAGATGTA CATTCAAGCC 1980 

TACAAGACCA GTAACCTGCG AATGAAGATC ATCAAGAATG ACTTCCCCAG CX»CCCTCTT 2040 

TACCTGGAGG GGGCGCTCAC CAGGAGCACC CATTACCAGC AATAOCAACX GGTTGTCACC 2100

CTGCAGAAGO GCTACACCAT CCACTGGGAC CAGAOQGCCC COGCOGAACT CGCCATCTGG 2160

CTCATCAACT TCAACAAGGG OGACTQGATC CGAGTGGGGC TCTGCTACCC GCGAGGCACC 2220 

ACATTCTCCA TCCTCTOGGA TGTTCACAAT CGCCTGCTGA AGCAAACX5TC CAAGAOGGGC 2280 

GTCTTOGTGA G6A0CTT6CA GATGGACAAA GTt3GAGCAGA GCTACOCTGG CftGGAGCCAC 2340 

TACTACTGGG A06A6GACTC A6GGCT6TTG TTCCTQAAGC TGAAAGCTCA 6AACQAGAGA 2400 

GAGAAGTTTG CTTTCTGCTC CATGAAAGGC TGTGAGAGQA TAAAGATTAA AGCTCrGATT 2460

CCAAAGAAOG CAGGCGTCAG 76ACTGCACA GCCACAQCTT ACCCCAAGTT CACCGAGAGG 2520

GCTGTOGTA6 A06TG00GAT GCCXZAAQAAG CTCTTTOGTT CTCAQCTGAA AACAAAGGAC 2580 

CATTTCTTGG AGGTGAAGAT 6GAGAGTTCC AA6CAGCACT TCTrCCACCT CTGGAAGGAC 2640 

TTOGCTTACA TTGAAGTGGA TGQ6AAGAAG TACOOCAGTT C3GGAGGATG6 CATCCAGGTG 2700 

GTGGTGATTG A0GG6AACCA AGGGCGOGTG GTGAGOCACA OGAGCTTCAG GAACTCCATT 2760

CTGCAAGGCA TACCATGGCA GCTTTTCAAC TATGTGGCGA CCATCCCTGA CAATTCCATA 2820 

6TGCTTAT6G CATCAAAGGG AAQATACGTC TCCAGAGGCC CAT66ACCA6 AGTGCTGOAA 2880 

AAGCTTQGGG GAGACAGOGG TCTCAA6TTG AAAGAGCAAA T06CATT0GT TGGCTTCAAA 2940 

GGCAGCTTCC GGCCCATCTG GGTGACACTO GACACTGAG6 ATCACAAAGC CAAAATCTTC 3000 

CAAGTTGTGC CCATCCCTGT GGTGAAGAAG AAGAAGTTGT GAGGACAGCT GCCGCCCGGT 3060

GCCACCTOST GGTAGACTAT GACGGTGACT CTT6GCAGCA GACCAGTGGG GGATGGCTG6 3120 

QTCCCCCAGC CCCTGCCA6C AGCTGCCTGO GAAGGCOSTG TTTCAGCCCT GAT6GGCCAA 3180 

GGGAAGGCTA TCAGAGAOCC TGGT6CTGCC A C CTGOOCCT ACTCAAfiTGT CTACCTGQftG 3240 

CCCCTGGGOC GGTGCTGGCC AATGCTGGAA ACAT7CACTT TCCTGCA60C TCTT6GGTGC 3300 

TTCTCTCCTA TCTGTGCCTC TTCAGTGGGG GTTTGGGGAC CATATCAGGA GACCTGGGTT 3360

6TGCTGACAG CAAAGATCCA CTTTCSSCAGG AGCCCTGACC CAGCTAGGAG GTAGTCTGGA 3420 

GQGCTGGTCA TTCACAGATC CCCATGGTCT TCA6CA6ACA AGTGAGGGTG GTAAATGTAG 3480 

GAGAAAGAGC CTTGGCCTTA AGGAAATCTT TACT0CT6TA AGCAAGAGCC AACCTCACAG 3540 

GATTAGGAGC TGGGGTAGAA CT6GCTATCC TTGGGGAAGA GGCAAGCCCT GOCTCTGGOC 3600 

GTGTCCACXrr TTCAGGAGAC TTTGAGTGGC AGGTTTGGAC TTGGACTAGA TGACTCTCAA 3660

AGGCCCTTTT AGTTCTGAGA TTCCAGAAAT CTGCTGCATT TCACATGGTA CCTGGAACCC 3720 

AACAGTTCAT GGATATCCAG TGATATCCAT GATGCTGGGT GCCCCAGOGC ACAOGGGATO 3780 

QAGAGGTQAO AACTAATGGC TAGCTTGAOG GGTCTGCAGT OCAGTAGGGC AGQCAGTCAO 3840 

GTCCAT6TGC ACTGCAAT6C CAGGTGGAGA AATCACA6AG AGGTAAAATG GAGGCCAGTG 3900 

CCATTTCAGA GGGGAGGCTC AGGAAGGCTT CTTGCTTACA GGAATGAAGG CTGGG6GCAT 3960

rrTGCTGGGG GGAGATGAGG CAGCCTCTGG AATGGCTCAG GGATTCAGCC CTCCCTGCOG 4020 

CTOCCTGCTG AAGCTGGTGA CTAOGGGGTC GCCCTTTGCT CAC3GTCTCTC TOGCCCACTC 4080 

ATCATQGAOA ACTOTG O TCA OAOGGGAQGA ATOGGCTTTG CTGCTTAXGA QCACASAStSl 4140 

ATTCAGTCOC CAG6CAGCCC TGCCTCTOAC TCCAASAGGO T6AAGTCCAC A6AA0TGAGC 4200

TOCTSCCTTA GGGOCTCATT TGCTCTTCAT CCRGGGAACT 6AGCACAGGG GGCCTCCAGG 4260

AGACCCTAGA TGTGCTCGTA CTCCCTCGGC CTGGGATTTC AGAGCTGGAA ATATAGAAAA 4320 

TATCTAGCCC AAAGCCTTCA TTTTAACAGA TGGGGAAA6T GAGCCCCCAA GATG6GAAAG 4380

AAOCACACAO CTAAGGQAGO GCXniGGGGAG CCCOiXXXni GGCCTTGCTG OCACAOCACA 4440 

TTGCCTCAAC AACOQGCCOC AGAOTGCCCA GGCACTCCTO AGGTAGCTTC TGGAAATGGG 4500

GACAAOTCCC CTG6AAGGAA AGGAAATGAC TAGAGTAGAA TGACAGCTAG CAGATCTCTT 4560

CCCTCCTCCT CXXaCCXSCAC ACAAACXXX5C CCTCCCCTTG GTGTTGGCGG TCCCTGTGGC 4620 

CTTCACTTTG TTGACTACCT GTCAGCCCAG OCTGGGTGCA CAGTAGCTGC AACTCCCCAT 4680 

TGQT6CTACC T OQCT CTOCT GTCTCTGCAG CTCTACA60T QAGGCCCAGC AGAOGaAGTA 4740 

GOOCTOGCCA TGTTTCTGGT GAOCCAATTT GOCTQATCTT GOOTOTCTGA ACAGCTATTG 4800

GGTCCACCCC AGTCCCTTTC AGCTOCTGCT TAATGCCCTG CTCTCTCCCT GGCCCACCTT 4860

ATAGAQAGCC CAAAGAGCTC CTGTAAGAGG GAGAACTCTA TCTGTGGTTT ATAATCTTGC 4920 

ACGAGOCACC A6AGTCTCCC TGSGTCTTGT GATGAACTAC ATTTATCCCC TTTCCTGCXrC 4980 

CAACCAOUM CTCmCCTT CAAAGAGGGC CTGGCTGGCT CCCTOCACCC AACT6CAC0C 5040 

ATGAGACTOG GTCCAAGAGT CCATTCCOCA 6GIQGQAGCC AACTGTCAGG GAGGTCTTTC 5100 

CCACCAAACA TCTTTCAGCT GCTGGGAGGT GACCATAGGG CTCTGCTTTT AAAGATATGG 5160

CTGCTTCAAA GGCCAGAGTC ACAGGAAGGA CTTCTTCCAG GQAQATTAGT GGTGATGGAG 5220 

AGGAGAGTTA AAATGACCTC ATGTCCTTCT TCTCCACGGT TTTGTTGAGT TTTCACTCTT 5280 

CTAATGCAAG GGTCTCACAC TGTGAACCAC TTAGGATGTG ATCACTTTCA GGTGGCCAGG 5340 

AATGTTGAAT GTCTTTGGCT CAGTTCATTT AAAAAAGATA TCTATTTGAA AGTTCTCAGA 5400

GTTGTACATA TGTTTCACAG TACAGGATCT GTACATAAAA GTTTCTTTCC TAAACCATTC 5460

ACC3UUSAGCC AATATCTAGG OVTTTTCTTG GTAGCACAAA TTTTCTTATT GCTTAGAAAA 5520

TTGTOCTCCT T6TTATTTCT GTTTGTAA6A CTTAAGT6AG TTAGGTCTTT AAGGAAAGCA 5580 
ACGCTCCTCT GAAATGCTTG TCTTTnTCT GTTGCCX3AAA TAGCTGGTCC TTTTTOGGGA 5640
GTTAGATGTA TAGAGTGTTT GTATGTAAAC ATTTCTTGTA GGCATCACCA TGAACAAAGA 5700
TA7ATTTTCT ATTTATTTAT TATATCTGCA CTTCAAGAAG TCACT6TCAG A6AAATAAA0 5760
AATTGTCTTA AATGTCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA

Seq ID NO: 451 Protein sequence
Protein Accession #: XP_051860.2

1 11 21 31 41 51

| | | | | |

KD6V13LSTEV VYKKGQDYRF ACYSR6RACR SYRVRFXXSGK PVRPKLTVTX DTOVKSTXUi 60

LBDKVQSWKP GDTLVIASTD YSKYQAEEFQ VLPCRSCAHI QVKVAGKPMY LHIGSEIDGV 120

DMRAEVGLLS RKirVMSEMB DKCYPYRNHI CNFFDFDTFG OJIKFALGPK AAHLBGTELK 180

HMGQQLVGQY FIKFHIiACDV DERGGYDPPT YXRDLSXRBT PSRCVTVH6S HGLLIKDWG 240

YNSLGHCFFT BDGPESRNTF DHCLGIiLVXS GTUiPSDRDS KKCRMZTGDS YP6YZPRPSQ 300

DQSAVSTFNM ANFIOQILXNC AAAGSEE1T3F WPZFBHVPTG PSVGKYSFGY SEBZPLGKPY 360

MKHABSNYRA GMIIDNGVKT TEASAKDKRP FLSIZSARYS PHODADFLKP REPAZIRHFI 420

AYKNQDH6AW LRGGDVHLDS CRFADMGIGL TLASGGTFPY DDGSKQSIKM SLFVGSSGNV 480

GTENKDmLIW GPG6LDHS6R TLPIGQNPPI RGIQLYDGPI NIQNCTFRKP VALESRHTSA 540

lAFRLtiNASfQ SCPBNNVTGZ AFEDVPITSR VFFGEPGFHP NQUaHDGDXT SVFBDVDGSV 600

SEyPGSYLTK KDNWLVRHPD CIKVPOHRGA ICSGCYAQMY IQAYKTSHUl MKZIKNDFPS 660

RPLYIiSSALT RSTHYQQYQP WTLQKGYTI HWDQTAPAEL AIWLINFNKG DWXRVGLCYP 720

RGTTFSILSD VHKRLLKQTS KTGVFVRTIjQ MDKVBQSYPG RSHYYWDEDS GLLFLKLRAQ 780

HEREKFAFCS MKGCERIKIK ALIPKNAGVS DCTATAYPKP TESAWDVPM PKKLFGSQLK 840

TKDHFLEVXM ESSKQHFFHL HNDFAYZEVD GKKITPSSBXS IQVWIDGNQ GRWSHTSFR 900

NSILQ6IFHQ LFNYVATIPD NSIVLHASXG BYVSRGPWTR VIiEKU3ADRG LKLKB^tAFV 960
GPK6SFRPIH VTU>TEDHKA KIPQWPIPV VKKkjuj



Seq ID NO: 452 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 261..2861



31 




TGCTATOGGA 
6CCAG6GTCT 
A6CTCX3CGQC 
GAGGOGTGAC 
CT6G6AGGCA 
6CTTC0CTG0 
AAOOCTGGAA 
TOCTGCTCAC 
TCATTAAAGA 
GAGGASAGCT 
TGTATGGAAG 
GOGTTGGTAA 
TGAACAAGAC 
GGGGCCACCG 
CTGACOGGTT 
ACG0GGT6CC 
TGGATGACAT 
GATTTAGACA 
ACCATATTGA 
AGACAGAGCA 
AGTGGA06GA 
CAGACCTCTG 
CTACAATGGA 
GGTTTGCTTG 
GQAAQOCTGT 
TGAACTT6GA 
CT6ATTACTC 
CCAA0CAG6T 
GCGTGGACAT 
T6GAGGACAA 
TTGGGGGCCA 
TGAA8CATA7 
AT6TAGA0GA 
ATACATTCTC 
T6GGCTATAA 
CTTTTGACCA 
ACAGCAAGAT 

g golag actg 

ACTGT6C06C 
06GQCCCCTC 
TCTATAACAA 
AAACCACGGA 
ACAGOCCTCA 
TCATT6CCTA 
ACAGCTQCCA 
GGGGCATTTT 
CXITGCOGCTG 
OOCftCTCATG 
CAGAGGAA7T 
AGTGAGCTCC 
CTCCAGGAGA 
TAGAAAATAT 



41 

I 

CAGAGGCTGG 
QAA0CX91GAT 
G0CTG6CGGT 
ACTGTCTOGG 
6GACTTCCTC 
G60CACATCC 
CCCTGOCCAT 
CTCTTCTGCC 
CCA06ACGAG 
GCATGCTGG6 
GGCTGATGAA 
AG6AQG0GCT 
CCTTCAOCCA 
TGGAGTTATT 
TGACACCTAT 
CGATGGCAGG 
G6CCAGGAAG 
CCCTTGGAGT 
ATATCATGGA 
TGGOGAATAT 
GTGGTTCGAT 
GAAAGCTCAC 
TGGAGTTAAC 
CTA0QAGO6G 
GAG60CCAAA 
GGATAATGTA 
CATGTACCAG 
CAAAGTGGCA 
GOQGGOQOAG 
ATOCTACCCC 
CATCAAGTTT 
GGGACAGCAQ 
AA6GGGAGGT 
TOSCTQOGTC 
CTCTTTGGGC 
CTGTCTTGGC 
GTGCAAGATG 
CAATGCTGrlG 
T6CAGGATCT 
CGTOQGAATG 
COSAGCACAT 
GGC3CTCT6CC 
CCA06AOGOC 
CAAGAACCA6 
TTTCAGAGGG 
6C1GGG6GGA 
CCTGCTGAAG 
ATGGAGAA6T 
CAGTCCCCAG 
TGGCTTAGGQ 
CCCTAGATGT 
CTAGCCCAAA 



TACAGAAACC 
GCTCTGGGAT 
CTGGTG66TC 
TATOAOCCAC 
ACA8TCCAT0 
CACTGCTTCT 
CTCCTTGTCA 
ATCACAGAGO 
TOCACCTTCT 
GA66AAACT0 
TACTCCCCAG 
TCCAACTACC 
AAGGACAA6C 
GACOOSCTQA 
GAOCAOOGGO 
GAGGCTCAGG 
GATGAGGCAG 
C7GGTGACTA 
GTGGTCA6A6 
GCAQO0CT6C 
OCTCATTTGC 
GCTOGTACTC 
6CCTTCATTT 






60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2620 
2880 
2940 
3000 
3060 
3120 




TAACAGATGG GGAAAGTGAG CCCCCAAGAT GGGAAAGAAC CACACAGCTA AGGGAGGGCC 3180 

TGGGGAGCCC CACCCTAGCC CTTGCTGCCA CACCACATTQ CCTCAACAAC CGGCCCCAGA 3240 

GTGCCCAGGC ACTCCTXSA6G TAGCTTCT6G AAATGGGGAC AAGTCCCCTC GAAGGAAAOS 3300 

AAATGACTAG AGTAGAATGA CAGCTA6CAG ATCTCTTCCC TCCT G CT C OC AGOGCACACA 3360 

AACCOGCCCT oOX.T'ltSGl'G TTGGCG6TCC C l tfi XaGC ClT CACTTTGTTC ACrACCTGTC 3420 

AGCXX3W3CCT GGGrTGCACAG TAGCTGCAAC TCCCCATTGG TGCTACCTGG CTCTCCTGTC 3480 

TCTGCAGCTC TACAGGTGAO GCCCAGCAGA GGGAGTAGGQ CTOGCCATGT TTCTGGT6AG 3540 

CCAATTTGOC T6ATCTTGGQ TGTCTGAACA 6CIATTGGGT 0CACCCCA6T CCCTTTCAGC 3600 

TGCTGCTTAA TOCCCTGCTC TCTCCCTGGC CXACCTTATA GAGA6CCCAA AGAGCTCCIG 3660 

TAAGAGQ6A0 AACTCTATCT GTGGTTTATA ATCTTGCA06 AG6CACCAGA GTCTCCCTG6 3720

GTCTTGTGAT GAACTACaTT TATCXXXmT CCTGCCCCAA CCACAAACTC TTTCCTTCAA 3780 

AGAGGGCCTG CCTGGCTCCC TCCACOCAAC TGCACXXATO AGACTOGGTC CAAGAGTCCA 3840

TTCCCCAGGT GGGAGCCAAC TGTCAGGQAO GTCTTTCCCA OCAAACATCT TTCAGCTGCT 3900 

GGGAGGTGAC CATAGGGCTC TGCTTTTAAA GATATGGCTG CTTCAAAGGC CAGAGTCACA 3960 

GGAAGGACTT CTTCCAGGOA GATTAGTOGT GATGGAGAGG A6AGTTAAAA TGACCTCATG 4020 

TCCTTCTTGT CCA06GTTTT GTTGAGTTTT CACTCTTCTA ATGCAAGGGT CTCACACTGT 4080 

GAACCACTTA 0GATGT6ATC ACTTTCAGGT GGCCA6GAAT GTTGAATGTC TTTGGCTCAG 4140 

TTCArrTAAA AAAGATATCT ATTTGAAAGT TCTCAGAOTT GTACATATGT TTCACAGTAC 4200 

AGGATCTGTA CATAAAAGTT TCTTTCCTAA ACCATTCACC AAGAGCCAAT ATCTAGGCAT 4260 

TTTCTTGGTA GCACAAATTT TCTTATTGCT TAGAAAATTO TCCTCCTTGT TATTTCTGTT 4320 

TOTAAGACTT AAOTGAGTTA OSTCTTT A AC GAAA6CAAGG CTOCTCTGAA ATGCTTGTCT 4380 

T' riTfCl ' GTT GCOGAAATAG CTGGTCCrTT TTCGG6AGTT ASATCTATAQ AGTGTTTGTA 4440 

TCTAAACATT TCrTGTAGGC ATCACCATGA ACAAAOATAT ATTTTCTATT TATTTATTAT 4500 

ATGTGCACTT CAAGAAGTCA CTGTCAGAGA AATAAAGAAT TGTCTTAAAT GTCATGATTG 4560 

GAG A TGTOCT TTGCATTGCT TG6AAGGGGT GTACCTAGAG CCAAGGAAAT T6GCTCTGGT 4620 

TTGGAAAAAT TTTGCTOTTA TTATAGTAAA CATACAAAGO ATGTCAAAAA AAAAAAAAAA 4680 
AAAAAAAAAA AAAAAAAAAA AA 

Seq ID NO: 453 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MGAAGRQDFL FKAMLTISWL TLTCFPGATS TVAAGCPDQS PELQPWNPGH DQDHHVHIGQ 60 

GKTLLLTSSA TVYSIBISB8 GKLVIKDHDE PIVLRTRHIL XDHGGELHAG SAIiCPFQQTP 120 

TIZLYGRADB GZQPDPYYGL mOVGRGGA LELBGQXKLS HTFLNKTLHP OGMABGGYFP 180 

ERSWSnSVZ VHVIDPKSGT VZHSDRFDTy RSKKESERLV QYLNAVPDGR ILSVAVNDEG 240 

SRNLDDMARK AMTKLGSKHF LHLGFRHPtfS FLTVKGNPSS SVEDHIEYHO HRGSAAARVF 300 

KI*FQTEHGEY FNVSIiSSEWV QDVEWTEWFD HDKVSQTKGG EKISDIAfKAH PQKICNRPID 360 

IQATTMDGVN LSTEWYKKG QPYRFACYDR 6HACRSYRVR PMXSKPVRPK LTVTIDTWW 420 

STlLNLmNV QSWKPQXniV lASTDYSMVQ ABEFQVLPCR 8CAPKQVKVA GKPMYIiHIGB 480 

BZDGVDHRAB VGLLSBNIZV KGEMEDKCYP YRBBIOIFFD PD TFGCT IKP ALGFKAABLB 540 

GTKLKHMGQQ LVGQYPIHPH LAGDVDEKGG YDPPTyiRDL SIHHTPSRCV TVHGSNGUiI 600 

KDWGYNSU5 HCFPTEDGPE ERNTFDHCU3 LLVKSGTLLP SDRDSKMCKM ITEDSYPGYI 660 

PKPRQSCNAV 8TFWKANPNN NLXHCAAAGS EETGFHFIFB BVPTGPSVQM YSFGYSBEIZP 720 

LGKPyNHRAH aNYRAGMIZD N6VKTTEASA KDKRPFLSXZ SARYSPHQDA DPLKPREFAI 780 

IRHPIAYKHQ DRGAffLROGD VWLDSCHFRO EAQBSFLLTG MKAGGIItLGG DEAA8GMAQ0 840 
FSPPC31CLI*K LVTTGSPFAH VSLAHS 

Seq ID NO: 454 DNA sequence

Nucleic Acid Accession #: NM_013282.2

Coding sequence: 85..2466

1 11 21 31 41 51

| | | | | |

OGACTCCTTA GAGCATOGCA TGGCTCAGAG GTGCIOOTAA AACTGATGGO GGTTTTTGCT 60 

6TCCCTCCCC TCAGCXSCaSA CACCAT6TGG ATOCAGOTTC GGAOCATGGA GGG6AGGCA6 120 

ACOCACAOOQ TGGACTOGCT GTCCAGGCTG ACCAAGGTGG AGGAGCTGAO GOGGAAGATC 180 

CAGGAQCTGT TCCACGTQGA GCCAGGCCTG CAGAGGCTGT TCTACAGGGG CAAACAGATQ 240 

GAOGAOGGCC ATACCCTCTT CGACTAOGAG GTCCGCCTGA ATGACAOCAT CCAGCTOCTG 300 

GTOOGCCAGA GCCTC6TGCT CCCCCACAGC ACCAAGGAGC GGGACTOOQA OCTCTOOSAC 360 

ACOGACTCOS GCTGCT60CT GGGCCAQAGT GAQTCAGACA AGTCCTCCAC 0CAC6G06AG 420 

GCGGCOGCCG AGACTGACAG CAGGCCAOCC GATGAGGACA T6TGGGATGA GACGGAATTG 480 

GGGCTGTACA AQGTCAATGA GTAOGTCGAT GCTOGGGACA CX5AACATGGG GGOGTGGTTT 540 

GAGGOSCAGG TGGTCAGGGT QACGCGGAAG GCCCCCTCCC GGGACGAGCC CTGCAGCTOC 600 

AC6TCCAGGC CGGCX3CTGGA GGAGGACXTTC ATTTACCAOG TGAAATAOGA OQACTAOOOS 660 

GAGAAOGGOG TGGTCCAGAT GAACTCCAGG GACGTCC3GAG CGCG0GCC06 CACCATCATC 720 

AAGTGGCAGG ACXTTCGAGGT GGGCCAGGTG GTCATGCTCA ACTACAACCC OGACAACCCC 780 

AAGQAGCXK3G GCTTCTOGTA CGAOGOGGAG ATCTCCAGGA AG0G06AGAC CAGGACGGC6 840 

OGGGAACTCT AOGCGAAOGT OGTGCTGGGG GATGATTCTC TGAAOGACTO TOQGATCATC 900 

TTCGTGGA06 AAGTCTTCAA GATTGAGC3G6 COGGGTCAAG GGAGCCCCAT GSTTOACAAC 960 

CCCATGAGAC GGAAGAGCGG GCCGTCCTGC AA6CACTGCA AQGACGACGT GAACA6ACTC 1020 

TGCOGGGTCT GCGCCTGCCA CCTGTG06GG GGCOGGCAOG ACCCC36ACAA GCAGC TCATG 1080 

TGCGATGAGT GOGACATOGC CTTCCACATC TACTOCCTGG ACCCGCCCCT CAGCAGTGTT 1140 

CCXSUJOGAGG ACGAGTGGTA CTGCCCTX3AG TGCC3GGAATG AT6CCAG0GA GGTGGTACTO 1200 

GCGGGAGAGC GGCTGAGAGA GAGCAAGAAG AAGGGGAAGA TGGGCTCGGC CACATGQTOC 1260 

TCACAGCX3GG ACTGGGGCAA GGGCATGGCC TGTGTGGGCC GCACCAAGGA ATGTACCATC 1320 

GTOCOGTCCa ACCACTAC3GG ACCCATCC06 GGQATCCC06 TCGGGACCAT GTGGCGffTTC 1380 

OGAGTCCAGG TCAGCGAGTC OGGTGTCCAT CGGCOCCAOG TGGCTGGCAT ACAOGGOOGG 1440 

AGCAACGACS3 GAGOGTACTC CCTAGTCCTG GCGGGGGGCT ATGAGGATGA OGTGGAOCAT 1500 

GGGAATTTTT TCACATACAC GOGTAGTGGT GGTOGAGATC TTTCOGGCAA CAAGAGGACC 1560 

GCGGAACAGT CTTGTGATCA GAAACTCACX: AACACCAACA GGGOSCrGGC TCTCAACTQC 1620 

TTT6CTCCCA TCAATGACCA ASAAGGQGCC GAGGCCAAOQ ACTGGCGGTC GGGGAA6CC3G 1680 

GTCA6GGTG6 TGGQCAATGT CAAG66TG6C AAGAATAGCA AGTAOGCCCC 0GCTGAG66C 1740 

AACOSCTAOG ATGOCATCTA CAAGGTTGTG AAATACTGGC COGAGAAGGG GAAGTCOGGG 1800 

TTTCTCGTGT GGOGCTACCT TCPGOGGAGG GAOGATGATG AGCCIGGCCC TTGQAOGAAG 1860 

GAGGGGAAGG AOC36GATCAA 6AAGCTGGGQ CTGACXIATGC AGTATCCAGA AG6CTACCTG 1920 




GAAGOCCTGG CCAACXS3AGA GOGAGA<S\AG GAGAACAGOV 
CAGGAGGGGG GCTTOGCGTC CCCCAGGACG GGCAAGGGCA 
GGA6GTG6CC OGAGCAGGGC OGGGTCCCOG CGCOGGACAT 
CCCTACAGTC TCAG6GCCCA 6CA6A6CAGC CTCATGAGAQ 
CT6T6GAATG AGGTCCT66C GTCACTCAA6 GAC066C0GG 
TTGTTCCTGA GTAAAGTGGA GGAGAOGTTC CAGTGTATCT 
CGGCCCATCA CGACCGTGTG CCAGCACAAC GTGTGCAAGG 
CGGGCACAGG TGTTCAGCTG CXXTTGCCTGC CGCTACGACC 
CAGGTGAACC AGCCTCTGCA GACOGTCCTC AACCAGCTCT 
OGGTGATCTC CAAGCACTTC TCGACAGGOQ TTTTGCTGAA 
CATOGGCACT GATTTTGTTC TTAC5TGG6CT TAACTTAAAC 
CCTAAAAAGG TTTGTCTTCC TTTTTTTTTA TTTTTATTTT 
GAATTTATGT ATTCTGGCTA AAAGTTGGAC TTCTCAGTAT 
CATAAAAGCC TGCAATTTCT CGACAAAACA ACACAAGATT 
ACTACGTGGT GTGGAGGCTG TTGATGTTTC l^TGTCAAG 
CAACTCTTTA A6AA000GAC AGGATCAGTC CTTCTCTAGO 
AGCAAGCATC TTCCTGACAG CATTTT6TCA TCTAAASTOC 
TGGCC0GT6G CAGCCCGTGG OVTQGCGTGG CTCAGCTGTC 
AAAGAGGAAA CATCTCGGGC CTAGTTCAAA CCTTTGCCTC 
TGCTtAGCGT CTGAGATCC6 OGTGAAAAGT CCTCT60CCA 
CAOGCAGAAA TQGCCTCAAO aOOACTCTGC TOCAOGTGGG 
TGTCCQAOSA AOGCGGCCAC GQAOGGAOQC CAGCACAC6A 
GATTCGTTCC TTCTTTCTAA AGACGACAGT CTTTGTTGTT 
GTCAACCA6A TTCTAGAAAC TGCGGTCATC CAGTTCTTCC 
GGAACCGTTT GAGCCTTATA GATCATTTAC ATTCAATTTT 
CTTACAAGAC GGTTTTTTTT TAATTTTTTT TTCTCTTAAT 
TTTTTTTT6T ASTTACTOTA TATGTAOCAA 6AAAGATATA 
TTGTTTT l X yr ATTTTTTTTC TTTTGAAAGG GTTTGTTAAT 
TTGCAGCCTA TACCTCAATA AAACAGGGAT ATTTTAAATC 
6A6CAATGTT ATT7TTAAAG G6TTTTTTTC ACCTOCTTAT 
AG6GAAGAAT GAGACAATTT TGT6TA0GCT TTTTCTAAA0 
TTAGATTCTC AGAATAAATG TTTTTCAGAO ATTQAAAAAA 

Seq ID NO: 455 Protein sequence
Protein Accession #: NP_037414.2



AGAGGGAGGA 
AGTGGAAGCG 
GCAA6AAAAC 
AOGACAAGAG 
06AG0GGCA6 
GCTGTC3W3GA 
ACTGCCTGGA 
TGGGCCGCAO 
TCCCCGGCTA 
AACGTGTOGG 
AGGTAGTGTT 
TCAAATCTAT 
TGTGTTTAGT 
TTTTAAAGAT 
TTCTCAGAAG 
GTTCIG6CCC 
AGTGACATGG 
TGTTGAAGTT 
AAAGCCATCr 
OGAGAGCAGG 
GCCAGG06TG 
AGTCAGGTGC 
AGCACTGAAT 
TGACACCGGA 
TTTAACTCAG 
GAACACATTT 
ACGTTAG6GT 
TTTTCTAATT 
ACATACCTGC 
TCTTAGATTA 
TCCAGTACTT 



I 

MWIQVRTMDG 
YEVRLNDTZQ 
PADEDMHDET 
OVIYHVKyDD 
AEISRKRETR 
SCKHCKDDVN 
PBCRUDASEV 
ZPGXFV6TMM 
SGGRDLSCaiK 
GGKKSKYAPA 
LGLTNQYPEO 
SPRRTSKKTK 
TFQCICOQEL 



11 
I 

RQTHTVDSLS 
LLVSQSLVLP 
ELGLYKVNBY 
YPENGWQNN 
TARBLYANW 
RLCRVCACHL 
VLAGERLRES 
StFXCVQVSBSG 
RTAEQSmX 
EGNRYDGIYK 
YLBALANRER 
VEPYSLTAQQ 
VFRPITTVOQ 



RLTKVEELRB 



VDARDTHMSA 
SRDVRARART 

LGDDSUJDCR 
CGGRQDPDKQ 
KKKAKMASA7 
VHBFHVAGXH 
LTNTIiSALAL 
WKyWPEKGK 
EKEtlSiCREES 
SSIjXREDKSN 
BIlVCfCDCLDR 



31 

1 

KIQEIiFHVEP 
SDTDSGOCLG 
WFEAQWRVT 
ZIKHQDIiEVa 
IIFVDEVFKl 
IiMCDECDMAF 
SSSQRDWGKG 



41 

I 

GLQRIiPYRGK 



NCFAPIMDQE 
SGFIjVWRYLL 
EQQEGGFASP 
AKLNNEVLAS 
SFRAQVF8CP 



Seq ID NO: 456 DNA sequence

Nucleic Acid Accession #: NM_001200.1

Coding sequence: 325..1514



GGGGACTTCT 
TGCCCCAGOS 
TGCCC3QACAC 
GAGAAG6AGG 
AGAGTTTTTC 
CTGOGGTCTC 
TTCCCCAGGT 
TOGQGGQGGC 
TOGAGTTGOG 
CCGTX3GTGCC 
CCGCCCOMSA 
ACCATGAAGA 
TCTTTAATTT 
TC06AGAACA 
TTTATGAAAT 
ACACCAGGTT 
TGATG06GTG 
T6GAGGAGAA 
ATGAACACAG 
GGCATOCTCT 
AGTCC3KGCTG 
GGATT6TGGC 
TGGCTGATCA 
ACTCTAAGAT 
ACCTTGACGA 
GT6GGTGT0G 



11 

1 

TGAACTTOCA 
GAGCCTGCTT 
TGAGACOCTG 
AGGCAAAGAA 
CATGTGGAOG 
CTAAASGTOQ 
CCTCCTGGOC 
GT06T0GGGC 
GCTGCTCAGC 
CCCCTACATG 
CCACCX3GTT6 
ATCTTTGGAA 
AAGTTCTATC 
GATGCAA6AT 
CATAAAACCT 
GGTGAATCAG 
GACTGCACAG 
ACAAGGTGTC 
CTGGTCACAG 
CCACAAAAGA 
TAAGAGACAC 
TrcCCOGGOQ 
TCTGAACTCC 
TCCTAAGGCA 
GAATGAAAAO 
CTAGTACAGC 



21 

I 

QGGAGAATAA 
CQCCATCTCC 
TTCCCAGOGT 
AAGGAACGGA 
CTCTTTCAAT 
ACCATGGTG6 
GGOGCGGCTG 
CGCCCCrCAT 
ATGTTC3GGCC 
CTAGACCTGT 
GAGAGGGCAG 
GAACTACCAG 
CCCACQGAGG 
GCTTTAGGAA 
GCAACAGCCA 
AATGCAAGCA 
GGACACGCCA 
TCCAAGAGAC 
ATAAGGCCAT 
GAAAAAGGTC 
CCTTTGTA06 
TATCAOGCCT 
ACTAATCATG 

611Y3TATTAA 
AAAATTAAAT 



31 

! 

CTTGCGCACC 
6AGCCCCACC 
GAAAAGAGA8 
CATTOSGTOC 
GGAOGTGTCC 
CCGGGACCCG 
GCCTOGTTCC 
CCCAGCCCTC 
TGAAACAGAG 
ATC3GCAGGCA 
CCAGC06AGC 
AAAOGAGTGG 
AGTTTATCAC 
ACAATAGCAG 
ACTCGAAATT 
G6T66GAAAG 
ACCATOGATT 
ATGTTAGGAT 
TGCTAGTAAC 
AAGCCAAACA 
TGGACTTCAO 
TTTACTGCCA 
CCATTGTTCA 
C3GACAGAACT 
AGAACTATCA 
ACAXAAATAT 



RKAPSRDBPC 
QWMUnfNPD 
ERPGEGSPMV 
HIYCLDPPLS 
MACVGRTKEC 
VLAQOYEDCV 
GAEAKDWRSG 
RSDDDEPGPW 
RTGKGKWKRK 
LKDRPASGSP 
ACRYDIiGRSY 



41 

I 

CCACTTTGOO 
GCCCCTCCAC 
ACTGCGGGGC 
TTGOG^AGG 
CCGCOTGCTT 
CTGTCTTCTA 
G6AGCTGGGC 
TGAOGAGGTC 
ACCXIACOCCC 
CTCAGGTCAG 
CAACACTGTG 
GAAAACAACC 
CTCAGCAGAG 
TTTCCATCAC 
CCCOGTGAOC 
TTTTGATGTC 
CGTGGTGGAA 
AAGCAGGTCT 
TTTTGGCCAT 
CAAACAGOGG 
TGAOGTGGGG 
CGGAGAATGC 
GAOGTTGGTC 
CA6TGCTATC 
GGACKIGGTT 
ATATATA 



GGAG6A6CAG 
GAAGTOGGCA 
CAAGGTGGAG 
CAA0S0CAA6 
CC0GTTCCA6 
GCTOTTGTTC 
CAGATCCTTT 
CTATGCCATG 
OGGCAATQGC 
AGGGCTOGTT 
TCCTCOGTTC 
ACATTTTCAG 
TCTTTGAAAA 
GGAATCAGAA 
TTGCTGCCAC 
CCAAGGTCAG 
TTCCCOGTGG 
GTT6CAAGGA 
CCCACX^VGAC 
GAGTTGGGGC 
T6ACTGA06C 
AAGTGOCTTT 
TATTX3AAAAT 
TGGGTGCTTG 
CAAGTGAGAA 
TCTAAATGAA 
TTGGrTGTTT 
TTACCAAAGT 
AGACAAACTG 
TrAATGTATT 
TGTOCAGATT 



51 
I 

QHEDGHTLFD 
6EAAAETDSR 
SSTSRPAZiEE 
NPKERGFWYD 
DNPMRRKSGP 
SVPSEDEWYC 
TZVPSNHYGP 
DHSKFFTYTG 
KPVRWBNVR 
TKEGKDRIKK 
SAGOGPSRAG 
FQIiFLSKVEE 
AMQVNQPLQT 



51 
I 

COGGTGCCTT 
TCCTOGGCCT 
OG GCACC OBG 
TOCTTTGACC 
CTTAGACGGA 
GOGTTGCTGC 
C36CAGGAAGT 
CT6AQ0GAGT 
AGCAGGGACG 
OOGGGCTCAC 
OSCAGCTTCC 
CGGAGATTCT 
CTTCAGGTTT 
OGAATTAATA 
AGACTTTTGG 
ACCCCCQCTG 
GTGGCCCACT 
TTGCACCAAG 
GATGGAAAAG 
AAAOGCCTTA 
TGGAATGACT 
CCTTTTCCrC 
AACTCTGTTA 
TGGATGCTGT 
GTG(Stf3GGTT 



1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480
3540 
3600 
3660 
3720 
3780 
60
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 



60 
120 
180 
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 



Seq ID NO: 457 Protein sequence
Protein Accession #: NP_001191.1




1 11 21 31 41 51 

| | | | | |

MVAGTRCLLA U.LPQVLLGG AAGLVPSLfSt RKPAAASSGR PS5QPSDEVL SEFELRXiLSM 60 

FGLKQItPTPS KDAWPPYMX* DLYRRBSGQP GSPAnSSRIiB HAASHASTVR SFHBEESLES 120 

X«PSTSGKTTR RFFRILSSIP TBEPZTSAEIi QVFRBQKQDA UaiNSSFBBR ZNiySXIKPA 180 
TANSKFPVTR UiOT 

Seq ID NO: 458 DNA sequence

Nucleic Acid Accession #: NM_001999.2

Coding sequence: 1..8736

1 11 21 31 41 51

| | | | | |

ATGGGGAGAA GAOGGAGGCT GTGTCTCCAO CTCTACTTCC TGTGGCTGGG CTGTGTGGTG 60 

CTCTGGGCGC AOGOCAOGaC OSGCCAGCCT CftCCCTGCtC G60CCM6CC GOCC OGGCCC 120 

CAOCOG006C 06CAACAG6T TOGGTCO S CT ACAGCA06CT CTGAAGG0G6 GTTTCTAGOG 180 

CCCGAGTATC GQBA6GAGGG TGCCGCAGTG GCCAGCOQOG TCCGCOSGCG AGGACAGCAG 240 

GACGTGCTCC GAGGGCCCAA 0GTGTGCX3GC TCCAGATTCC ACTCCTACTG CTGCCCTGGA 300 

TCGAAGACGC TCCCTGGAGG AAACCAGTOC ATTGTCCCGA TTTGTAGAAA TAGTTGTGGA 360 

GATQGATTTT GTTCCOGTCC TAACATGTGT AC T TOTTCC A GT6GGCAAAT ATCATCAACC 420 

T6TGGATCAA AATCAATTCA OCAGT6C31GT GTGAGAT6CA TGAATGGT66 OACCTOTGCA 480 

GATGACCACT GCCAGTGCXa GAAAGGATAT ATTGGAACTT ATTGTGGACA A0CXX5TCTGT 540 

GAAAATGGAT GTCAGAATGG TGGACGTT6C ATCGCCCAAC CGTGTGCTTG TGTTTATGGQ 600 

TTCACTGGTC CACAGTGTGA AAGAGATTAC AGGACAGGCC OGTGTTTCAC TCAGGTCAAC 660 

AACCAGATGT GCXAAGGGCA OCTGACAGGC ATTGTCTGCA OGAAGACTCT 6TGCTGT6CC 720 

ACCACTGGAC GGGCGTGG6G CCATCCCTGT GAGATGTGTC CAGCCCAGCC TCAGCCCTGC 780 

0GAC6GGGTT TCATCCCCAA O^TCOGCACT GGAGCTTGCC AAGATGTT6A T6AATGCCAG 840 

OCTATCCCAG GGATATGCCA AGGAGGAAAC TGTATCAATA CAGTGGGCTC TTTTGAATGC 900 

AGATGCCCTG CTGGTCACAA ACAGAGTGAA ACTACTCAGA AATGTGAAGA CATTGATGAO 960 

TGCAGCATCA TTCCTGGGAT ATGTGAAACT QGTGAArGTT CCAACACCST GGGAAGCTAT 1020 

■ mWlGTri GTCCAOGTGG ATATGTAACC TCAACAGATO 6CTCT0GATG CATOGATCRG 1080 

AOAACAGGCA TGTQTTTCTC GGGCCT G GTG AATGGCCGCT GTGCACAAGA GCTCCCGGGG 1140 

AGAAT6AC6A AAATGCAGTQ CTGCTGTGAG CCTGGCCGCT GCTGGGGCAT OGGAACCATT 1200 

CCTGAAGCCT GTCCTGTCAO AGGTTCTGAG GAATATOXSV GACTrTGCAT GGATGGACTT 1260 

CCAATG6GAG GAATTCCAOG GAGT6CIG6T TCCAGAOCTG OAGGCACTGG GQGAAATGGC 1320 

TTTGCCCCAA GTGGCAATGG CAATGGCTAT GOCCCAGGAG GGACAtSGCTT CATCCCCATC 1380 

CCTGGAGGCA ATGGCTTTTC TCCTOGOGTT GGGGGAGCCG GTGTGGGGGC OGGGGGACAG 1440 

GGACCTATCA TCACTG<5ACT AACAATTCTG AACCAGACAA TAGATATCTG TAAGCATCAT 1500 

GCTAACCTTT GTTTAAA76G AOGCTGTATA CCAACTGTCT CAASCTAC06 ATGTGAATGC 1560 

AACATGGQTT ATAAGCASGA TGCAAATQGA GATTOTATAO ATOTTOATGA ATQCACATCA 1620 

AATCCCTGCA CTAATGQAGA TTOTOTTAAC ACACCTOOTT CCTATTATTO TAAATGTCAT 1680 

GCTGGATTCC AGAGGACTCC TACXAAOCAA GCATGCATTG ATATTGATGA GTGCATCCAG 1740 

AATGGGGTTC TTTGTAAAAA CGGTCGATGC GTGAACTCAG ATGGAAC3TTT CX^AGTGCATT 1800 

TGCAATGOCO GCTTTGAATT AACTACAGAT GGAAAAAACT GTGTTGATCA TGATGAATGT 1860 

ACAACTACCA ACATGTGTTT GAATGGAATG TGCA7CAAT0 AAGATGGCAO CTTCAASTQC 1920 

ATCTGCAAAC CAGGATTTGT CTT6GCTCCA AATG0G08TT ACTGTACTGA TGTT6ATQAA 1980 

TGCCAGACCC O^GGAATCTG CATGAATGGG CACTGCATCA ACAGTGAAGG GTCCTTCCGC 2040 

TGTGACTGTC CCCCAGGCCT GGCTGTGQ G C AT0GATG6AC GTGTGTGTGT TGATACTCAC 2100 

ATGCGCAGTA CCTGCTATGG AGGAATCAAG AAAGGAGTGT QT6T6C6T0C TTTCCCOGGT 2160 

GCAGTGACCA AGTCOGAATG CT6CTGTGCC AAT0CA6ACT ATOGTTTTGQ AGAACCCTGC 2220 

CAGCCATGCC CTGCAAAAAA TTCAGCTQAA TTCCAOGGOC TTTGTASTAG T9GA8TAGGT 2280 

ATCACTGTGG ATGGAAGAGA TATCAATGAA TGT6CTTTGG ATCCTGATAT ATGT6CCAAT 2340 

6GGATTTGTG AAAACTTAOG TGGT7VGTTAC OGTTGTAATT GCAACAGTGG CTATGAACCA 2400 

GATGCCTCTG GAAGAAACTG TATTGACATT GATGAATGTT TAGTAAACAG ACTGCTTTGT 2460 

GATAAOGGAT TGTGCC6AAA CACGCCAGGA AGTTACAGCT GTAC6TGCCC ACCA6GGTAT 2520 

6TGTTCAGGA CTGA6ACAGA GACCTGTGAA GATATAAATG AATGTGAAAO CAACCCATGT 2580 

GTCAATG6GG CCTGCAGAAA CAACCTTGGA TCTTTCAATT GTGAATQTTC GCCCGGCAGC 2640 

AAACTCAGCT CCACAGGATT GATCTGTATT GACAGCCTQA AOGGGACCTO TTGGCTCAAC 2700 

ATCCAG6ACA GC06CTGTGA QOTGAATATT AATGGA3CCA CTCTGAAATC TGAATGCTGT 2760 

GCCACCCTCG QAGCCGCCTG GGGGAOCCCC TGTGAG0G6T GTGAACTAGA TACAGCTTGC 2820 

CCAAGA6GGC TTOCCAGGAT TAAAGGTGTT A06TGT6AAG ATGTTAATGA GTGTGAGGTG 2880 

TTCCCTG6C6 TTTGrCCAAA TGGACGCTGT GTCAACAGTA AGGGATCTTT TCATTGCQAG 2940 

TGCCCTGAAG 6CCTTACGTT GGAT66GACT QOCCGTGTAT GTTTGOATAT TOGCATGGAG 3000 

CAGT6TTACT TGAAGTGGQA TGAAGATGAA TOCATCCACC COSTTCCTGG AAAGTTCCGC 3060 

ATGQATGCCT GCTGCTGTGC TGT06G0G0Q GCTTGGGGCA C06A6TGTGA G6AGTGCCCC 3120 

AAACCT6GC31 COU^GGAATA CQAGACACTG TXKXXCOGOG GG6CTGGCTT T6CTAACCGA 3180 

OGGGATOTTC TTACTGGGCS GCCATTTTAC AAA6ACATCA ATGAATOCAA ASCATTTCCT 3240 

OGGATGTGCA CTTATGGGAA GTGCAGAAAT ACAATOGGAA GCT7CAAATG COGTTGCAAT 3300 

AGTGGCTTTG CTCTAGACAT GGAGGAAAGA AACTGCA0C3G ACATCGAOGA GTGCAGGATT 3360 

TCTCCTGACC TCTGTGGCAG TGGAATCTGC GTC3ATACAC CGGGCAGCTT TGAGTGOSAG 3420 

TGCTTOGAAO GCTATGAAAG TGGCTTCATG ATGATGAAGA ACTGCATGGA CATTGACGGA 3480 

TGTGAA08TA AOCCTCTOCT TT G T A GGGGT G6CA0CT6TG TGAACftCTGA GGGCA6CTTT 3540 

CA6TGTGACT GCCCACTGGQ ACAOGAGCTG TCACCATCCX: GTGAGGACTO TGTGGATATT 3600 

AATGAATGCT CCCTGAGTGA CAATCTCTGC AGAAATGGAA AATGTGTGAA CATGATTGGA 3660 

ACCTATCAGT GCTCTTGCAA TCCT6GATAT CAGGCTACGC CAGACCGCCA GGGCTGTACA 3720 

GATATTGATG AATGTATGAT AATGAACGGA GGCTGTGACA CCCAGTGCAC AAATTCAGAG 3780 

GOAAGCCAOQ AATOCAQCTG CA6TGAQ6GT TATGCCCTGA TGCCAGATG6 GAGAXCGTGT 3840 

GCAGACATT6 ATGAATGTQA AAACAATCCT 6ATATCTGTG ATGGC6G0CA GTGTACCAAC 3900 

ATTCCTGGAG AGTATCGCTG CXTCTGCTAT GATGGCTTCA TGGCTTCCAT GGACATGAAA 3960 

ACATGCaTTG ATGTCAATGA ATCTGACCTA AATTCAAATA TCTGCATGTT T6GGGAATGT 4020 

GAGAACAC3UI A6GGATCCTT CATTTGCCAC TGTCA6CTG6 GTTACTCAGT GAAGAAGGGG 4080 

ACGAGAGrar OTACAGATGT G6AT6AGTGT GAAATTGGTO CTCATAftCTG 0SACAT6CAT 4140 

GCCTC A TSTC T6AATATO0C AGGAAGCTTC AAOTGIAGCT GCAGAGAAGQ CTG6ATT86A 4200 

AACGGCATCA AGTGTATTGA TCTGGAOGAA TQTTCTAATG OAACCCACCA GTGTAGCATC 4260 

AATGCTCAGT GTGTAAATAC CCOGGGCTCA TACC3GCTGTQ CCTGCTCOGA AGGTTTCACT 4320 

GGTGATGGCT TTACCTGCTC AGATGTTGAT GAGTGTOCAG AAAAOVTAAA CCTCTGTGAG 4380 
AA0SGACA6T GCCTTAATGT OCXX»»3T6CA TATOGCTGOS AGTGT6AGAT GGGCTTCACT 4440

CCAGCCTCA6 ACAGCAGATC CTGCCAAGAT ATTGATGAAT GCTCCTTCCll AAACATTTGT 4500 

GTCTCTGGAA CATGTAATAA CCTGCCTGGA ATGTTTCATT GOiTCTGCtSA TGATGGTTAT 4560 

GAATTGGACA GAACAGGAGG GAACTGTACA GATATTGATG AGTGTGCAGA TCCTATAAAC 4620 

TGTGTCAATG 6CCTATGTG7 CAACACiGCCT GGTOGCTATG AGTGTAACTG CCOLCCOGAT 4680 

TTTCAGTTGA ACOQVACTGG TgTGG U l TC f GTT6ACAACC 6TGTGGGCAA CIGCTACXTTG 4740 

AAGTTTGGAC CTOGAGGAGA TGGGA G TCT G TCTTGCAACA GCGAGATCX30 6GTGG60CTC 4800

AGTOGCTCTT CATGCTGCTG CTCTCTGQGA AAGGCCTGGQ GAAACCCCTO TGftGACATGC 4860 

CCCCCTGTCA ATAGCACTGA ATATTACACX: CTGTGTCCOG GAGGTGAAGO CTTCAGACCT 4920 

AA0CXX:ATCA CAATCATTTT AGAAGACATT GA06AAT6CC AG6AGTTACC AGGTCTCIGC 4980 

CAGGGTGGAA ACTGCATCAA CACTTTTGG6 AGCTTOOVCT GT6A6T6C0C ACAAGGCtAC 5040 

TAOCTCA6C6 AGGATACCXX3 OVTCTGTGAB GAXATTGATG AGTGTTTTGC ACATCCTGGT 5100 

GTOTGTGGGC CTGGGACCTG CTATAACACC CTGGGAAATT ACACCTGC31T TT6CCCACCT 5160 

GA6TACATGC AGGTCAATG6 AGGCCACAAC TGCAT6GACA T6AGAAAAAG CTTTTGCTAC 5220 

OGAAGCTATA ATGGAACCAC TTGTGA8AAT GAGTTGCCTT TCAATGTGAC AAAAAGGATG 5280 

TGCTGCTGCA CATATAATGT GGGCAAAGCT 6GGAACAAAC CTTGTGAAOC ATGCCCAACT 5340 

CCAGGAACAG CTGACTTTAA AACCATATGT GGAAATATTC CTGGATTCAC CTTTGACATT 5400 

GhCACAGGAA AAGCTGTTGA CATTGATGAA TGTAAAGAGA TTCCAGGCAT TTGT6CAAAT 5460 

QGT O TSTQCA TTAftCCAGAT TGGCAGTTTC CGCTGTGAAT GCCCTACAGG ATTCAGTTAC 5520 

AAT6ACCTGC TGTTGGTTTG TGAAGATATA GATGAGTGCA GCAATGGTGA TAATCTCTGC 5580 

CAGCGGAATG CAGACTGCAT CAATAGTCCT GGTAGTTACC 6CTCTGAATG TGCOGOGGGT 5640 

TTCAAACTTT CACCCAATGO GGCCTGTGTA GAT0GCAAT6 AATGTTTA6A AATTCCTAAC 5700 

GTTTGGA6TC ATGGCTTGTO TGTTGATCTG CAAGGAAGTT ACCAGTGCAT CTGCCACAAT 5760 

GGCTTTAAGQ CTTCTCAGGA CCAGACCATG TGCATGGATG TTQATGAGTG CGAGCGGCAC 5820 

CCATGTG6AA ATGGAACTTG TAAAAACACC. OTTGOATCCT ATAACTCTCr GTGCTACCCA 5880 

GGGTTTGAAC TCACTCATAA TAATGATTGC CTOGACATAG ATGAGTGCAG TTCCTTTTTT 5940 

GGTCAGGTGT GCAGAAATGG ACGTTGTTTT AATGAAATTG GTTCTTTCAA GTGTCTATGT 6000 

AACGAAGGTT ATGAACTTAC CCCftGATGGC AAAAACT6TA TAGACACTAA TGAGTGTth^ 6060 

OCCCTTCCOO OCTCTTGCTC TCCTG G T A OC TGTCAGAATT TGGAOGGATC CTTCAGATGC 6120 

ATCTGTCCCC CAGGGTATGA AGTAAAAA6C GAGAACTGCA TTQATATAAA TGAATGTGAT 6180 

GAAGATCCCA ACATTTGTCT TTTTGGTTCC TGTACTAATA CTCCAGGGGG CTTCCAGTGC 6240 

CTCT6CCCCC CTGGCTTTGT ACTATCTGAT AATQGA06GA GAT6CTTTGA TACTOGOCAG 6300 

AGCTTCTGCT TCACAAATTT TGAAAATGGA AAGTGTTCT6 TA C CCAAAGC TTTCAACAGC 6360 

ACAAAAGC3UI AATGCTGCT G TASTAA6ATG 0CA06AGA6G 6CTGGGGGGA CCCCTGTGAG 6420 

CTGT6CCCCA AAGAOGATGA AGTTGCATTT CAGGATTTGT GTOCATATGG CXMX3GAACT 6480 

GTCCCTAGTC TTCATGATAC ACGTGAAGAT GTCAATGAGT GTCTTGAGAG CCCAGGCATT 6540 

TGTTCAAATG GTCAATGTAT CAACACXX3AC GGATCTTTTC GCTGTGAATG TCCAATGGGC 6600 

TACAACCTT6 ACTACACTQO AGTAGGCTGT QTQQATACTG ATGA G T GT TC AATOGGCAAT 6660 

COGTGTOGAA ATGGTACATO CACXAATGTT ATTGGGA6TT TTGAATGCAA TTGCAATQAA 6720 

GGCTTTGAGC CAGGGCCCAT GATGAATTGT GAAGATATC3V AOGAATGTGC CCAGAACCCA 6780 

CTGCTGTGTG CTTTAC6CTG CATGAACACT TTTGGGTCXrr ATOAATGCAC GT6CC0GATT 6840 

GGCTATGCCC TCAGGGAAGA TCAAAAGATG TGCAAAGATC TGGATGAATG TGCTGAAGGG 6900 

TTACA06ACT QTGAATCTAG GGGCATQATG TOTAAGAATC TAATOGGCAC CTTCATGrOC 6960 

ATCTGCCCIC CTGQAATOGC 00QAAG6CCC QATGGAGAAG GCT8TGTAGA TGAAAATGAA 7020 

TGCAGGACCA AGCCAGQAAT CTGTGAAAAT GGAOGTTGTG TTAACATTAT TOGAAQCTAT 7080 

AGATGTOAGT GTAATGAAGG ATTCCAGTCA AGTTCTTCAG GCACTGAATG CCTTGACAAT 7140 

OSACAGGGTC TCTGCTTTGC AGAG6TACTG CAGACAATAT GTCAAATGGC ATCCAGTAGT 7200 

CGCAATCrOG TCACTAAOTC ASAATGCTOC TGTGATGGTG GG06AGGCT6 G6G0CACX»G 7260 

TGOSAGCTTT GCCCACTTCC TGQAACTOCC CAGTACAAAA AGATATGTCC TCATOGCCCA 7320 

GGATATACAA CTGATGGAAG AGATATTGAT GAATGTAAGG TAATGCCAAA CCTCTGCACC 7380 

AATQOTCAGT GCATCAATAC CATGGGCTCA TTCCGATGCT TCTGCAAQGT TGGCTACACC 7440 

ACAGACATCA GTGGAACCTC TTGTATAGAQ CTTGAT6AAT GCTCCCAGTC CCCGAAACCA 7500 

TGCAACTACA TCTGCAAGAA GACTOAOGGG AOTTATCAST GTTCATGTCC 6AGGGGGTAT 7560 

GTOCTGCAAG AG6AT8GAAA GACATOCAAA GACCTT6AT6 AAT8TCAAAC AAAGCA6CAT 7620 

AACTGCCAGT TCCTCTGTGT CAACACCCTG GGGGGGTTTA CCTGTAAATQ TCCACCTGGT 7680 

TTCACACAGC ATCACACTGC TTGTATCGAC AACAACGAAT GTGGGTCTCA ACCTTTGCTT 7740 

TGTG6AG6AA AG6GAATCTG TCAAAACACT CCAGGCAGTT TCA6CTGIGA ATGCCAAAGA 7800 

GGOTTCTCTC TTGATGGCAC COGACTGAAC tOTQAAGATO TTGATGAATG TGATGGQAAC 7860 

CACAGGT6CC AACACG6CTG CCAOAACATC CTGOGTGGCT ACA6ATGT6G CTGCOCCCAA 7920 

GGCTACATCC AGCACTACCA GTGGAATCAO TGTGTOGATG AGAATGAATG CTCCAATCCC 7980 

AATGCCTGTO QCTCTGCTTC CTGCTACAAC ACCCTGGGGA GTTACAAGTQ OGCXTPGCCCC 8040 

TCGGGGTTCT CCTTOSACCA GTTCTCCAGT GCCTGCCAOG ACXSTGAATGA GTQCTOSTCC 8100 

TCCAASAACC CCT6CAATTA C6SCT6CTCT AACA006AG0 QOaOCTACCT CTGTGGCTGC 8160 

C COCCTG G OT ATTACAGAGT GG6ACAAGGC CACTGTGTCT CAGQAAIGGG ATTTAACAA6 8220 

GGGCAGTACC TGTCACTGGA TACAGAGGTC GATGAGGAAA ATGCTCTGTC CCCAGAAGCA 8280 

TGCTACGAGT GCAAAATCAA GGGCTATCCT AAGAAAGACA GCA6GCAGAA GAGAAGTATT 8340 

CATGAAOCTO ATCCCACTGC TGTTGAACA6 ATCAGCCTA6 AGAGTGT06A CATGGACAGC 8400 

CCOGTCAACA TGAAGTTCAA CCTCTOCCAC CTC3 GGCT CIA AGGAGCACAT OCTGGAACTA 8460

AGGC0060CA TCCAGCCCCT CAACAACCAC ATCOQTTATO TCATCTCTCA AGGGAAOGAT 8520 

GACAQOSTCT TCOSCATCCA CCAAA6GAAT GGGCTCAOCT ACTTGCACAC 660CAAGAAG 8580

AAGCTCATGC C06GCACATA CACACTGGAA ATCACTAGCA TCCCTCTCTA CAAGAA6AAG 8640 

GAGCTTAAGA AACTGGAAGA GAGCAATGA6 GATGACTACC TCCTAGGGGA GCTTOG66AG 8700

OCTCTGAQAA TGAGGCTGCA GATTCAGCTC TATTAAOCGT TCACAOACTT GQOOCCAGGC 8760

TCAAATCCTA GCACAGCCAG TCTGCAGAAG CATTTGAAAA GTCAAGQACT AATTTTAAA6 8820

AGGAAAAATA ATAATAACTC ri ty f TT CTTT CCTCCCTGTC TTAGACTTT6 AATGTTGACC 8880 

CTCACAGGGA GG6ATAATTT AGACTCTGGT ATG6CCAAAG ATTT6AGCTC AAAG6CAACC 8940 

GTGGTTACTG TATTTTTTAT ATAACTTCAT TTTAAAATAT ATTAAAAGAA ACCTAAA3GT 9000 

TCAAGATATC AGCATATGGC ACTAAATGCA CAAAAATAAT GTGAGCTTTT TTTTTTTTIT 9060 

CCTGTTAGCA GTCTGTAACA CTTTGGGTAT TTT6CTATAG TTGCTAATTA AAAAAATATA 9120 

GATGTTTATT TATTTTTAAT GCAGTAATAT ATQGAGAAAT 6AACAAACTA TGTAAACAAA 9180 

AAGGGAAACT CACTTGTTTT TCTTTAGATT TATAAATTTG AGCTATTTTT TTTAGAGGTG 9240 

CTTTTTAAAA ATCCAATAGA TACAAGAGAT GTTTOCTTTG GTTTTCTGCC AGTCATCCAG 9300 

CTGATACACA CCTGATOGAT TTTAAAGAAA GOCACACAGA GCTGAATOGG GCAGTGCTAA 9360 

TCAAXAATTT AAAAGACATG AATGTCATTA GATOCTTTAT AAOGTAGATC GAAGCOUJIQ 9420 

CAGCTCATTT GTGACAACAT TTCATATCAC CAGACACACC AGGCAACAGA AGTTGAAGCA 9480 

CAAOC3^CT6T MSCMAKIKC CTTGACTGCT TGTGAGACCA TTAGCATTGC AGGCCAAAOC 9540 

GTACrGTATT TCCTTCTCAT AACCTCAAGG AACCATATGT GCTACCCACA ACACCTCATT 9600 
CTTACCCAGG GTGCGCTGOG TCCTCATGGT
TTGAAAGGGA ACACCTGGCA TTCTGTGCTG 
ATTATGTTOl AGTTATTTCft GGATTGOCAT 
GGAATAZATO TlX^riVnG T TOTTTOWC 
TGTA6TTATA CACCATAT6C CTCATTTTAT 
ACAATGAATT GATGTTTAGT rrGCTTTAGT 
TATTAAGAGC AOGTATCCAT TATTCTTCTC 
CCAAACCTCA TATGTGAAAT G6CCAAAGCA 
CT6TGCTGAC CAAAGATTAO TAACCA6TTA 
TTAATAACTA AAAAAAAACT CGTOOC 



ACTGTAGGCA CCTGAAGAAC CGCCGTTCCC 9660 

•meaT G Cto tcttaaataa tggtgcattt 9720 

ATQTGCAAAC AAATCATGCA ATGCAGGCAA 9780 
CCATTTTTTT TTTAGAATTT TCATTAATAC 9840 
CATAGOCTAT TGTGTAT6AA AGAT6TTT8T 9900 
CATTTAAAAA GATATTGTAC CAOGATGTGC 9960 
AACOCAAQAA CCTGT y rCCT GGACCAGTGA 10020 
CAT6CAG6CT CCTGGTTGTT CCTCTCAAAC 10080 
TACOCAGTAT TTTCAGGTTT TATTGTTTrT 10140 



Seq ID NO: 459 Protein sequence
Protein Accession #: NP_001990.1



1 11 21 

I I I 

HGRRRRULQ liYFIMLGCW LHAQGTAGQP 
PEYREBGAAV ASKVRRRGQQ DVLRGPNVOG 
DGFCSRPKMC TCSSGQISST CGSKSIQQCS 
ENGCQMG6RC lAQPCACVYG FTGPQCERDY 
TTGRAWSPC a4CPAQFQPC RRGFIFNIRT 
RCPACSKQSB TTQKCEDIDS C8ZZPGXCET 
RTGMCPSGLV NGRCAQELPG RMTXMQGCXS 
PMGGIPGSAO SRPG6TGGNG PAPSOfCaiGY 
GPIZTGLTIIi NQTXDIC3CUH ANUXNGRCI 
NPCTNQDCVN TFOSYYCKOI AGPQRTPTRQ 
OrAOFBLTTD GXNCVDKDEC TTmMCLIIGM 
CQTPGIOOIG HCIHSEGSFR CDCPPGLAVG 
AVTK5BCCCA NFOYGPGEPC QPCPAKUSAS 
GZCEtlLRGSY RCNOfSGYEP DASGRNCXDI 
VPRTETBTCB DINBCBSHPC WtGACBOSOSnJS 
IQDSRCSVKZ NGATLKSECC ATLGAAHG5P 
FPGVCPKGRC WSKGSPHCB CPEGLTLDGT 
MDACCCAVGA AWGTECEECP KPGTKEYETL 
GKCTYGRCRH TIGSFKCRCN SGFAIiDMEER 
CPEGYBSCFM KMIENCNDIDG CERNPLLCBG 
SECSIiSDNLC RNGXCVNMI6 TYQCSOIPGY 
GSYECSCSEG YAU4PDGRSC ADZDECEmTP 
TCIDVNECDL NSNICMPGBC ENTKGSFICH 
ASCLI7IPGSF KCSCRBGHIG KGIKCIDLDE 
(SXSFTCSDVD BCABUZNIjCB NGQCUIVPGA 
VSGTCNNLPO NFHCICDDGy ELDRT6GNCT 
PQUIPTGVGC VDNRVGNCYL KFGPRGDGSL 
PPVNSTBYYT IK3»GGEGFRP NPITIILEDI 
YLSEDTRICE DIDBCPAHPG VGGPGTCXNT 
RSyHGTTCai BX.PFNVTKRM COCTYNVQKA 
BTGKAVDIDB CKBIPQZCAK OVCZNQIGSF 
QRHADCINSP G5YRCECAAG PKLSPNGACV 
GFKASODQTM CMDVDBCERH PCGNGTCKNT 
GQVCRNGRCF NEZGSFKCLC NB6YELTFDG 
ZCPPGYEVKS rarCXDZNECD EDPNlOiFGS 
SFCFTKPaiO KCSVPKAFNT TKAXCCCSKM 
VPSLBDTRED VZTECLESPGI CSNGQCZNTD 
PaafGTCTW IGSFEC3IQ7B GPEPGFMMKC 
GYALKEDQKM CKDLDECAEG LHDCESR6KM 
CRTKPGICEN GRCVNIIGSY RCECNEGFQS 
RNLVTKSECC CDGGR6W6HQ CELCPLPGTA 
NGQCINTMQS FRCPCKVOYT TDISGTSCID 
VLQOJGKTCK DhDECQTKQB NCQFLCVNTL 
CGGKOIOQNT PGSFSCECOR GFSLDATGUJ 
GYIQHYQKNQ CVDENBCSNP NAOGSASCYN 
SKNPOJYGCS NTEGOYLCGC PPGYYRVGQG 
CYBCKZNQYP KKDSRQKRSI HEPDPTAVEQ 
RPAIQPLNNH IRYVlSQCaiD DSVPRIHQRH 
ELKKLEE5NE DDYLLGELGE ALRMRLQIQL 



31 41 51 

i I I 

QPPPPKPPRP QPPPQQVRSA TAGSEGGFLA 60 

SRFHSVCCPO WKTLPGGNQC IVPICRNSCG 120 

VRCMNGOTCA DDHCQCQKGY IGTYCGQPVC 180 

RTGPCFTQVN KQMOQGQLTG IVCTKn^CCA 240 

GAOODVD60Q AZP6ICQGG27 CZNTVGSFEC 300 

65CSNTVGSY FCVCPRGYVT 8TD6SRCIDQ 360 

PGRCW6IGTZ PBACFVRGSB BYRRLCMDGL 420 

GPGGTGFIPI PGGtfGFSPGV G6AGVGAGGQ 480 

PTVSSyRCBC NMGYKQDAtn; DCIDVDECTS 540 

ACZDZDBCIQ NGVLCKNGRC VNSDGSFQCZ 600 

CZNEDGSFXC ZCaVGFVLAP NGRYCTDVDB 660 

MDGRVCVDTH MRSTCYGGIK KGVCVRPFPG 720 

PHGI.CSSGVG ITVDGRDINB CALDPDICAN 780 

DECLVNRW: DNGLCRNTPG SY8CTCPPGY 840 

SFHCBCSPGS RLSSTGIiZCZ DSIiHGTCHUS 900 

CERCELDTAC PRGLARZKQV TCEDVHECEV 960 

GRVCU3IRMB QCYLKHDEDE CIHFVPGKFR 1020 

CPRGAGPAHH GDVLTGRPFY KDINECKAFP 1080

NCTDIDECRZ SPDL06SGZC VNTPGSFECB 1140 

GTCVNTBGSF QCDCPZi(SEL SPSREDCVDZ 1200 

QATPDRQGCT DZDECMXMHG GCOTQCTHSB 1260 

DICDG6QCTO IPGEYRCLCY DGFMASMDMK 1320 

CQLGYSVKKG TTGCTZ7VDEC EIGAHKCDME 1380 

CSHGTHQCSX HAQCVNTPGS YRCACSESFT 1440 

YRCBCEKGFT PASDSRSOQD ZDECSFQEnC 1500 

DZDECADPZN CVKGLCVNTP GRYEOTCPPD 1560 

SCSITEIGVGV 8RSSCCCSLG KAWGNPCETC 1620 

DEOQELPGLC QGGNCINTFG SPQCECPQGY 1680 

L6NYTCICPP EYNOVNGGEN OfDMRKSPCY 1740 

GNXFCEPCPT PGTADFKTZC GNZPGFTFDZ 1800 

RCECPTGFSY NDLLLVCBDZ DECSNGDNLC 1860

DRNECLEIPN VC5HGLCVDL QGSYQCICHN 1920 

VGSYNCLCYP GPELTHNKDC LDIDECSSFF 1980 

RNCZDTOECV ALPGSCSPGT 0QNLE6SFRC 2040 

CTNTPGGFQC LCPPGFVLSD NGRRCFDTRQ 2100 

PGB6KGDPCB LCPKDDBVAF QDLCPYGHGT 2160 

GSFRCBCPMO YBLDYTGVRC VDTOECSIGH 2220 

EDIMBCAQNP LLCALRCMNT FGSYECTCPI 2280 

CKNLXGTFMC ICPPGMARRP DGEGCVDENB 2340 

SSSGTECU2H RQ6UTABVL OTZOQNASES 2400 

QYKKZCPBGP GYTTDGRDZD BCKVMPNLCT 2460 

LDBCSOSPKP QJYICKNTBG SYQCSCPRGY 2520 

GGFTCKCPPG PTQHHTACID NNECGSQPLL 2580 

CEDVDECDGN HRCQR6C3C3NZ LGGYROGCPQ 2640 

TL6SYKCACP S6PSFDQPSS ACHUVNECSS 2700 

HCV8GMGFNX GQYLSLDTEV DEENALSPEA 2760 

ISIiESVDMDS PVHNKFIILSH L6SKEHZLEL 2820 

GLSYUHTAKK RLHPGTYTIiB XTSZPIiYKKX 2880 
Y 



Seq ID NO: 460 DNA sequence

Nucleic Acid Accession #: NM_013372.1

Coding sequence: 63..617



I 11 21 

I 1 I 

GOGGCCGCAC TCAGCX3CCAC G0GT06AAAG 
GTATGAQGGG CACAGCCTAC AOSGTGGGAG 
CGGCTOCTQA AGGGAAAAAG AAAGGGTCCC 
AGCACAATGA CTC3U3AGCAG ACTCAGTCGC 
GGGGCOUVGG GOGGGGCACT GOCATGCCCX3 
CCCTGCATGT GftOGGAGCGC AAATACCTGA 
AGCAGACCAT CCA06AGGAA G6CTGCAACA 
GCCAGTGCAA CTCTTTCTAC ATCCCCAGGC 
CCTQCTCCTT CTGCAAGCCC AAGAAATTCA 
AACTACAGCC AOCZACCAAG AAGAAGA6A6 
CCATOGATTT G6ATTAA6CC AAATCCA06T 
AGQAAGTCCC A6A0CTAAAA CAACCAGATT 
AAOCCCCAGC TGCCTCCTOQ CAGGAGCCTC 
ATGGGTGCCT G'i Ua/m ' m TTAGACACCA 
COCTATTTTO TAAACATATC TGCTTTAATG 



31 41 51 

I I I 

OGCAGGCCCC OAGGACOOQC OGCACItSVCA 60 

CCCTGCTTCT CCTCTTGGGG AOCCTGCTGC 120 

AAGGTGCCAT CCCCCX3GCCA GACAAGGCCC 180 

CCCAGCAGCC TGGCTOCAGG AACCGGGGGC 240 

GGGAG6AGGT GCTGQAGTCC AGCCAAGAGG 300 

AGOGAGACTG GTGCAAAACC CAGCC6CTTA 360 

GTOGCACCAT CATCAACOGC TTCTGTTAOG 420 

ACATCOGGAA GGAGGAAGGT TCCTTTCAGT 480 

CTACCATGAT GGTCACACTC AACTGCOCTG 540 

TCACAOGTGT GAAGCAGTGT OGTTGCATAT 600 

GCAC0CA6CA TGT0CTA6GA A76CA6CCCC 660 

CTTACTT66C TTAAA0CTA6 AG6CX3USAA6 720 

CTTGTGOGTA GTT0GT6T6C AT6AGTGTGG 780 

GA6AAAACAC AGTCTCTGCT AGAGAGCACT 840 

O66ATGTA0C A6AAACCCAC CTCACOCOGG 900 
CTCACATCrA AAfiGGGGGGG GCOSTGGTCT OGTTCltSACT TTGTCTTTTT GT60CCT0CT 960

GGGGMXAGA ATCTOCTTTC 66AATGAATG TTGATGGAA6 AGGCTCCTCT GAGG6CAAGA 1020 

GACCTcrrrr agtgcpgcat tcgacatgga aaagtccttt taacctgtgc ttccatcctc 1080

Cm'CCnX'f CCTCCTCACA ATCCATCTCT TCTTAAGTTG ATAGTGACTA TGTCAGTCTA 1140

ATCTCTTGTT TGCCAAGGTT CCTAAATTAA TTCACTTAAC CATGATGCAA ATGTTTTTCA 1200 

TTTTGTGAA6 AOOCTCCAGA CTCIG6GAGA GGCT Q GTOTG GGCAftGGACA AGCAGGATAG 1260 

TG6A6TGAGA AAOQGAGGGT GGAOGGTGAa GCCAAATCAG GTCX3U3CAAA AGTCAGTAG6 1320 

GACATTGCAG AAGCITGAAA OGCCAATAOC AGAACACAGG CTGATGCTTC TGAGAAAGTC 1380 

TTTTCCTAGT ATTTAACAGA ACCOJIGTGA ACAGAGGAGA AATGAGATTG CCAGAAAGTG 1440 

ATTAACTTTG GCOGTTGCAA TCTGCTCAAA CCTAACACCA AACTGAAAAC ATAAATACTG 1500 

AOCACTOCTA TGTT06GACC CAAGCAAGTT AGCTAAAC3CA MXXMCKC TCTOCTTTGT 1560

CCCrCAGGTG GAAAAGAGAG GTAGTTTAGA ACTCTCT6CA TA0GGGTGG6 AATTAATCAA 1620 

AAACCKCAGA GGCTGAAATT CCTAATACCT TTCCTTTATC GTGGTTATAG TCAGCTCATT 1680 

TCCATTCCAC TATTTCCCAT AATGCTTCTG AGAGCCACTA ACTTGATTGA TAAAGATCCT 1740 

OCCTCT G CTG AGTGTACCTG ACAGTAAGTC tAAAGATGAR AGAGTTTAGG GACTACTCTG 1800 

TTTTAGCMG ARATATTKTG OSGGTCTTTT TGTTTTAACT ATTGTaU^OA GATTOGGCTA 1860 

RAGAGAAGAC GA06AGAGTA AGGAAATAAA GGGRATTGGC TCrGGCTAGA GAGTAAGTTA 1920 

GGTCTTAATA CCTGGTAGAA ATGTAAGGGA TATGACCTCC CTTTCTTTAT GTGCTCACTG 1980 

AGGATCTGAG GGGACXXrTGT TAGGAGAGCA TAGCATCATG ATGTATTAGC IXSTTCATCTG 2040 

CTACTGGTTO GATGGACATA ACTATTGTAA CTATTCAGTA TTTACTGGTA GGCACTGTCC 2100 

TCTGATTAAA CTTGGCCTAC TGGCAATGGC TACTTAGGA7 TGATCTAAGG GCCAAAGTSC 2160 

AG6GTGGGTG AACTTTATTS TACTTTGGAT TTGGTTAACC TGTTTTCTTC AA6CCTGAGG 2220 

TTTTATATAC AAACTCCCT6 AATACTCTTT TTGCCTTGTA TCTTCTCAGC CTCCTAGCCA 2280 

AGTCCTATGT AATATGGAAA ACAAACACTG CAGACTTGAG ATTCAGTTGC CXSATCAAGGC 2340 

TCTGGCATTC AGAGAACCCT TGCAACTCGA GAAGCTGTTT TTATTTCGTT TTTGTTTTGA 2400 

TCCAGTGCTC TCCCATCTAA CAACTAAACA GGAGCCATTT CAAGGOGGGA GATATTTTAA 2460 

ACAOCGAAAA TGTTGGGTCT GATTTTCAAA CTTTTAAACT C3VCTACTGAT GATTCTCACG 2520 

CTAGGOGAAT TTGTCCAAAC ACATAGTGTG TGTGTTTTGT ATACACTGTA TGACXXCACC 2580 

CCAAATCTTT GTATTGTCCA CATTCTCCAA CAATAAAGCA CAGAGTGGAT TTAATTAAGC 2640 

ACACAAATGC TAAGGCAGAA TTTTGAGGGT GGGAGA6AA0 AAAAGGGAAA GAAGCTGAAA 2700 

ATGTAAAACC ACACCAGGGA GGAAAAATGA CATTCAGAAC CAGCAAACAC TGAATTTCTC 2760 

• nxfnxrmT aactctgcca caagaatgca atttogttaa tggagatgac ttaagttggc 2820 

A0CA6TAATC TTCTTTTAGG AGCTTGTAOC ACASTCTTGC ACATAA6TGC AGATTPGGCT 2880 

CAAGTAAA6A GAATTTCCTC AACACTAACT TCACTGG6AT AATCA6CA0C GTAACTACCC 2940 

TAAAAGCATA TCACTAGCCA AA6AGGGAAA TATCTGTTCT TCTTACTGTG CCTATATTAA 3000 

GACTAGTACA AATGTGGTGT GTCTTCCAAC TTTCATTGAA AATGCCATAT CTATACCATA 3060 

TTTTATTOGA GTCACTGATG ATGTAATGAT ATATTTTTTC ATTATTATAG TAGAATATTT 3120 

TTATGGCMG ATATTTGTGG TCTTGATCAT ACCTATTAAA ATAATGCCAA ACAOCAAATA 3180 

TGAATTTTAT GATGTACACT TTGTGCTTG3 CATTAAAAQA AAAAAACACA CATCCTGGAA 3240 

GTCTGTAAGT TGTTTTTTGT TACTGTAGOT CTTCAAAOTT AAGAGTGTAA GTGAAAAATC 3300 

TGGAGGAGAG GATAATTTCC ACTGTGTGGA ATGTGAATAG TTAAATGAAA AGTTATGGTT 3360 

ATTTAATGTA ATTATTACTT CAAATCCTTT GGTGACTGTG ATTTCAAGCA TGmTCTTT 3420 

TTCTCCTTTA TATGACTTTC TCT6AGTTG0 GCAAASAASA AGCTGACACA OCXSTATGTTO 3480 

TTAGAGTCTT TTATCTGGTC AGGGGAAACA AAATCTTGAC CCA6CTQAAC ATGTCTTCCT 3540 

GAGTCAGTGC CTGAATCTTT ATTTTTTAAA TTGAATGTTC CTTAAAGGTT AACATTTCTA 3600 

AAGCAATATT AAGAAAGACT TTAAATGTTA TTTTGGAAGA CTTACGATGC ATGTATACAA 3660 

AOSAATAGCA GATAATGATG ACTAOTTCAC ACATAAAOTC CTTTTAAGGA GAAAATCTAA 3720 

AATOAAAAGT GGATAAACAO AACATTTATA AGTOATCAGT TAATGCCTAA GAOTQAAAGT 3780 

AOTTCTATTO ACATTCCTCA ACATATTTAA TATCAACTOC ATTATGTATT ATOTCTGCTT 3840 

AAATCATTTA AAAACGGCAA AGAATTATAT AGACTATGAQ GTACCTTGCT GTGTAGGAGG 3900 

ATGAAAGGGG AGTTGATAGT CTCATAAAAC TAATTTQOCT TCAAGTTTCA TGAATCTGTA 3960 

ACTAGAATTT AATTTTCACC CX»ATAATGT TCTATATAGC CTTTGCTAAA GAGCAACTAA 4020 
lAAATTAAAC CTATTCTTTC AAAAAAAAA 

Seq ID NO: 461 Protein sequence
Protein Accession #: NP_037504.1

1 11 21 31 41 51

| | | | | |

KSRTAYTVQA LLLLLGTLLP AABGK2CKG5Q GAIPPTOKAQ HNDSEQTQSP QQPGSRKRGR 60 

GQGRGTAMPG EEVLESSQEA LKVTERKyiiK RDHCKTQPLK QTIHEEGOIS RTIINRFCVG 120 

QCKSFYIPRH ZRKEE6SFQS CSFCKPKKFT TMNVTLNCPE LQPPTKKKRV TRVKQCRCIS 180 
ZDXiD 

Seq ID NO: 462 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..2733

1 11 21 31 41 51

| | | | | |

ATGAAAGTTG GAGTGCTGTG GCTCATTTCT TTCTTCACCT TCACTGAOGG CCACGGTGGC 60 

TTCCTG06GA AAAATGATGG CATCAAAACA AAAAAAGAAC TCATTGTGAA TAAGAAAAAA 120 

CATCTAGGGC CAGTOGAAaA ATATCAOCTG CTGCTTCAB3 TGACCTATAO AGATTCCAAO 180 

6AGAAAA6A0 ATTTGAQAAA TTTTCTGAAG CTCTTGAACC CTCOITIATT ATGGTCACAT 240 

GGOCTAATTA GAATTATCAG AGCAAAGGCT ACCACAGACT GCAACAGCXTT GAATGGAGTC 300 

CTGCAGTGTA CCTGTGAAGA CAGCTACACC TGGTTTCCTC CCTCATGOCT TGATCCCCAG 360 

AACTGCTACC TTCACA0G6C TGGAGCACTC CCAAGCTGTG AAT6TCATCT CAACAACCTC 420 

AGCCAGAGTG TCA A TTTCTG TGAGAOAACA AAOATTTGGO GCACTTTCAA AATTAATGAA 480 

AGGTTTACAA ATGACCTTTT GAATTCATCT TCTGCTATAT ACTCCAAATA TGCMMGGk 540 

ATTGAAATTC AACTTAAAAA AGCATAlt^ AGAATTCAA6 GTTTTGAGTC GGTTCAGGTC 600 

AQCCAATTTC QAAAIGGAAG CATCGTTGCT GGGTATGAAG TTGTTGGCTC CAGCAGTGCA 660 

TCTGAACTGC TGTCAGCCAT TGAACATGTT GCOGAGAAGG CTAAGACAGC CCTTCACAA6 720 

CTGTTTCCAT TAGAAGACGG CTCTTTC3USA GTGTTCGQAA AAOCCCAGTG TAATGACATT 780 

GTCTTTGGAT TTGGGTCCAA GGATGAT6AA TATACCCTGC CCT6CAGCAG TGGCTACAGG 840 

G6AAACATCA CAGOCAAGTG TGAGTCCTCT GGGTGGCAGG TCATCAGGGA GACTTGTGTG 900 

CTCTCTCTGC TTGAAGAACT GAACAAGAAT TTCAGTATGA TTGTAGGCAA TOCCACTGAG 960 

GCSUGCTOTGT CATCCTT06T GCAAAATCTT TCTGTCArCA TT06GCAAAA CCCATCAACC 1020 
ACAGTGGGGA ATCTGGCTTC GGTGGTGTCG ATTCTCAGCA ATATTTCATC TCTGTCACPG 1080

GCCAGCCATT TCAGGGTGTC CAATTCAACA ATGGAGGATG TCATCAGTAT AGCTCACAAT 1140 

ATCCTTAATT CAGCCTCAGT AACCAACTG6 ACAGTCTTAC TGOQGGAAGA AAAGTATGCC 1200 

AGCTCAOQGT TACTAGAGAC ATTAGAAAAC AltSUSCACTC 7G6T6CCTCC GACA6CICTT 1260 

CCTCTGAATT TTTCTC6GAA ATTCATTGAC TGOVAAGGGA TTCCAGT6AA CAAAAGCCAA 1320 

CTCAAAAGGG GTTACAGCTA TCAGATTAAA ATGTGTCCCC AAAATACATC TATTCCCATC 1380 

AGAGGCCGTG TGTTAATTGG GTCAGACXAA TTCXSiGAGAT CCCTTCCAGA AACTATTATC 1440 

AGCATGGCCT CGTTGACTCT GGGGAACATT CTACOOGTTT CCAAAAATGG AAATGCTCAG 1500 

GTCAATGGAC CTGTGATATC CAOCSGTTATT CAAAACTATT CCATAAATGA AGTTTTOCTA 1560 

TTrTTTTCCA AGATAGAGTC AAACCTGAGC CAGCCTCATT GTGTGTTTTG GGATTTCAGT 1620 

CATTTGCAGT GGAAOGATGC AGGCTtSCCAC CTAGTGAATG AAACTCAAGA CATCX3TGACX3 1680 

TGCCAATGTA CTCACTTGAC CTCCTTCTCC ATATTGATOT CACCTTTTGT CCCCTCTACA 1740 

ATCTTCCCCG TTGTAAAATG GATCACCTAT GTQGGACTGG GTATCTCCAT TGGAAGTCTC 1800 

ATTTTATGCC TGATCATOGA GGCTTTGTTT T6GAA0CAGA TTAAAAAAAG CCAAAOCTCT 1860 

CACACACX5TC GTATTTGCAT GGTGAACATA GCCCTGTCCC TCTTGATTGC TGATGTCTGG 1920 

TTTATTGTTG GTGCCACAjGT GGftCAOCAOG 6TGAACCCTT CTGGAGTCTG CACAGCT6CT 1980 

GTGTTCTTTA CACACTTCTT CTAOCTCTCT TTGTTCTTCT GGATGCTCAT GCTTGGCATC 2040 

CTGCTGGCTT ACXXX»TCAT CCTOGTGTTC CATCACATOG CCCAGCATTT GATGATGGCT 2100 

GTTGGATTTT GCCTGGGTTA TGGGTGCCCT CTCATTATAT CTGTCATTAC CATTGCTGTC 2160 

ACGCAACCTA GCAATACCTA CAAAAGGAAA GATGT6TGTT GGCTTAACTG GTCCAATGGA 2220 

AGCAAACCAC TCCTGQCTTT TOTTGTCCCT GCACTGGCIA TTGTG6CXGT GAACTTOGTT 2280

GTG6TGCTGC TA6TTCTCAC AAA6CTCTGG AQGGC3GACTG TTGGGGAAAG ACTGAGTOGG 2340 

GAT6ACAAGG CCACCATCAT C0GC6TGGGG AAGAGCCTCC TCATTCTGAC CCCTCTGCTA 2400 

GGGCTCACCr GGGGCTTTGG AATAGGAACA ATAGTGGACA GCCAGAATCT GGCTTGGCAT 2460 

GTTATTTTTG CTTTACTCAA TGCATTOCAG GGATTTTTTA TCTTATGCTT TGGAATACTC 2520 

TTGGACAOTA AGCXGOGACA ACTTCTGTTC AACAAGTTGT CTGCCTTAAG TTCTTGGAAO 2580

CAAACAGAAA AGOUUUVCTC ATCAGATTTA TCTGCCAAAC CCAAATTCTC AAAGCCTTTC 2640 

AACCCACTGC AAAACAAAGO CCATTAT6CA TTTTCTCATA CTGGAOATTC CTOOQACAAC 2700
ATCATGCTAA CTCAGTTTGT CTCAAATGAA TAA 

Seq ID NO: 463 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MKVGVLHLIS FFTFTD6KOG FUGKMD6XKT KKELXVKKKK EL6FVEEYQL LLQVTYSDSK 60 

EKROLRNFIiK LLKPPLIiHSH 6LIRZIHAKA TTDOtSUIGV LQCTCEDSYT HFPPSCLDPQ 120 

HCYWTAGAL PSCECEHiNNL SQSVNFCBRT KINGTFKIHE RFTNDLLNSS SAIYSKYANG 180 

IBIQLKKAYB RIQGFESVQV TQFRHGSIVA 6YEWGSSSA SELLSAIEHV AEKAKTAUiK 240 

LFPLEXXSSFR VFGKAQOIDI VF6FGSRDDB YTLPCSSGYR OIZTAKCBSS GWQVIRETCV 300 

LSUiEELMRM PSMIVGNATB AAVSSFVOHIi SVZZIIQNPST TVGNIASWS ZLSNI8SLSL 360 

ASHPRVSNST MEDVISZATOI ILHSASVTNH TVLLREEKYA SSRLIiETLEH ISTLVPPTAL 420 

PLNPSRKPID WKGIPVNKSQ LKRGYSYOIK MCPQNTSIPI RGRVLIGSDQ FQRSLPBTII 480 

SMASLTLGNI LPVSKNGNAQ VNGPVISTVI QNYSINEVFL FFSKIESNLS QPHCVFWDFS 540 

HLQMNDAGCa LVNETQDIVT CQCTHLTSFS ILMSPFVPST IFPWIWITY VGIiGISIGSL 600 

ILCLZZEALP WXQZKKSQTS HIRRZCNVtfZ ALSLLZADVH FZVOATVDTT VNPS6VCTAA 660 

VFPTRPFYLS LFPWHLMLOZ LLAYRZZLVP HHMAiQHLMMA VGECLGYGCP LIISVZTZAV 720 

TQPSNTYKRK DVCWLKWSNO SKPLLAFWP ALAXVAVNFV WLLVLTKLH RPTVGERLSR 780 

DDKATXZRVG KSLLILTPLli GLTWGFGIGT IVDSQHZAWH VZFALIiNAFQ GFFII«CPG1L 840 

WSKLRQLLP NKLSALSSWK QTEKQNSSDZi SAKPXFSKPF NPLQNK6HYA FSffTQDSSDS 900 
ZMbTQFVSMB 

Seq ID NO: 464 DNA sequence

Nucleic Acid Accession #: AB035089.1

Coding sequence: 9845..10219

1 11 21 31 41 51

| | | | | |

GGGCATOCAG OCATCGGGGA AAATCCATAG TX3CAGATAAA GCAAGGAGGA A6AAGAAGGA 60 

CAGrrCTAGT AAAAGGGAGA ACATCAATAT AGGATGTTTC TTAGCAATAG AAAAAGAAGG 120 

CCAAGAGGAA TTAGGGAGAG AGTTATAAGA GATCAGCAAG GGGACAGGGT TAGATTTG6T 180 

TTGGTTTGAA AGCATACAGT AAATATGATG TCPGTCCCTG GCAGTGTTGG CAGAGTAGGA 240 

AGGAGGAAGG GAGGCAAGAG ATAATATCAT TTTCTCTGTO CTCCAACTGT ACTTACATAT 300 

GAGACTATTT CCCTCTCTGC TTTTCAAACC TTACTQGAGT TGTTTTCCCT CATGAAAACC 360 

AAGAAAGGAA AOCTAGTTAO TCTTGTTCTG AGGTTGTTCA ATGTATACAT ATCTATATCT 420 

GTAGACAGAA TCCTTGGGAA TACAGTAATT GACATATATT CTGTTATTTG ATGCTTGAAA 480 

AATCTCCTCC ACTAACCAGT TTCCCTATAG ATTGCCACAA GCACATAATA AGAAACAATA 540 

AATAAAATGT TCTCTTGACT TTGTTACTTA ACAATGCTQA GAAAACTITA CAGCCTTCAT 600 

AA6GAAGTGA GGTCCAGQAA AATCTAGGAO ATATTTCTTA ACCAATCTAT AAAGGCATTA 660 

GTAATGACAG GATATTTCCT GAAAGTGTAA TTTCCCATTG AGGATTTGTT TTTAATTTCT 720 

GGATTCCTGQ AGCCAATGAA GTTGGTGTAT GTTTATGAAA TATCAAGAGA CATAAGTTGG 780 

CAAGTGTTCA TATGCAAAAA CTTCTTGGAA TTTCTGAGTT CTCTGTGGCA ATATATQACA 840 

TCAGGATATG TOCAGTCTCA CACACCAGGA TATGTCCTTT CTAGCCTGTC TATCACATOC 900 

TAGGAGAACT ATTTAGOAAC AGAAAAAAAT 6CCTGAAATG ATTTCTCATT TGAACTCATC 960 

CAAGCTTTCT CTAAATTTAA QCAAACTCCT GGTCATTTTC AGTTAGTACC TTTCCTTAAG 1020 

TTCAACCTTC AGGGCAAACC TCCGTGCCTC AGACGTTTAG CCATAGTCTG AAATTC TCTT 1080 

CCATAGATTG GTCCCCTGTA ACCCXX5GTTT GTCTCAGCTT GTTATCCTGT TTTTTTCTTC 1140 

CCTOCATTCC CAGGATGAGC r i-Grmfri C TGTCCTATGA GACAT TAGA T TCXTPTTTCTT 1200 

TG6TACCCGA GTAAATCCAT OCTACTCCAA TAGAOOAAGG TCCATTTTTG TCTTATAGCG 1260 

CTGGATGCAG ACTCAGCTGA GAAGACCATT ATTCATTTTr GQAATTCTTT ATCTCAGATA 1320 

TTTCCTCTTC TTTCTTTTTC TTCTATCTTT GGATTTTTA6 TCCATCAAOG COCCATTAGT 1380 

CTATTCCCOG ACTTCAATCA GGGAACTTAT ACCTCTTAAA CTCAnCAGA GACTCAAAAC 1440 

ATAXATATT6 ATACAG6AGA CCTAAGAAGA GCATGTCTTQ UUUUl'TGAGG AAACAGGCAG 1500 

GTGftGAAATT TCCAQATTGG AAACACAGCT TCCTTTCTCC CAT CCAG CCC CTACTTTCAG 1560 

CCTATGTGTT TCTOGCACCT TQTTGTAGAT AAATCTCCCT TGACTTTGTG ATGTGCTGAG 1620 

AAAACAAACT CAOGQCTGGT 6TTAAAAAG6 GCCCATGACA ATACCAAGTG TTGGGQAGAA 1680 

TGTGGAGAAA TCAGAACTCT ATTCAOGGTC GGTTGGAATG CACACTTOTO CAGAATTCTA 1740 
TGGAGAAGAG TCTGGCATTT CCTCRAAA.TG TTRAOCTGGA TTTACCATAT GACCCASOGA 1800

TTTCATTCAT AOGTTTATAC TCAAAAfiAAA TGAAC3WVTA TGCCATGCAA AAAAATGTAC 1860 

ATGAAAGGTC ACAACATCAT TATTCATAAT AGTAAAAGGA TGGAAACAAC ACAAATGTCC 1920 

ATCAACTTAT GATTAAAGAA AATCTGGTCT ATTCATAGAA TtSGAATATTA TTCGACCACA 1980

AAAAGGAATG ATGTACTGAT CCATGCAATG ATCTGGACAA ACCATGAAAA TAA CACTAG A 2040 

TTAAAGAAGC CAGTCACAAA AGGACTrACT GTATGATTCC ATTTAOCTGA AATGTTTGGA 2100

ATA6GCAAAT CCATAC3AAAC AGGAG6TA6A ' tVO^njam CCAGGGTCTC CAGGAACGSA 2160 

AGAATGAAGT ACAAGATTTC TTTTGGAGGT AGTGAAATTG TTGTGGAAT6 AGATCATGAT 2220 

GATtSATAGCA CAACTTTGTG AATATAATAA AATCATTGAA TTGTACAGTT GAAT TIATg 2280 

TATATAAATT ATATGTTAAT AAAAAGGGQG TCCACAAAAC AAACA6CCCC CX^CTCTGGT 2340 

TGTCAGGGAG ATATTGGATT AAATGGCCTT GGACAACAAC COCTCTCCCT GGC CftCAG AC 2400 

ATTCTTCAGA TTACAAGATA TTCCAGGGGA AAdCTGGAA TGAGTCTGAA GC CAGGT GCT 2460 

AAACASAAG6 ACCATTGAGA AATGTTGTGA TCCTGACAGG TCAAGCAATT TATTTTTOSG 2520

CrrCATTTTT AAATGTAAAA TTAGAAAGCT GCCATTTAAA ATCGCC OSTC TGTTTCAATT 2580 

GCTCTTCTCA GTGTCAGCCT GTTAACTCAA TGTGTTAGTC TGTTTTCATG CTG CTGA TAA 2640 

AAAC3^TACCT GAGACTC3GCA AGAAAAAGAG GTTTAATTGG GCTTAGAGTT CCA0GT6ATT 2700 

GGGGAGGCCT CAGAATCACA GTAGGAGGCA AAAGTTATTC TTACATGGT6 GCTGCAAGA6 2760 

AA6AT6AGGA AGAAGCAAAA GAA6AAACCC CTGATAAACC CATCGGATCT CCTGAGGCTT 2820 

ATTAACTATC ATQAQAATAO CACAAGAAAG ACCGGCCCCC ATGATTCAAT TAC CTCTACC 2880 

TGGGTCCCTC OUITAACATG TGGAAATTCT GGTAGATACA ATTCAAGTTG AGATTTGGGT 2940 

GGGAACACAG CCAAACCATA TCACTCAGCA AGGCAGATAA CTTTCTCACT GAGCCTATGC 3000 

AACAGAAAAC CATCTGGGAT GGrTGTAAGG GGCAGA6GAA GTGACTGGTA GGATCACTGC 3060 

CAAAGCTGAO CACTCAGGAG AAGQCAATAG AATCCTATTC TCCATAGTAT GCTATAAGAT 3120 

ACrOAAGTAC ACTTCrTCAC TATCTCTTTG GACTTAGAAT TAGCACTACA TTCCTTGTTA 3180 

TACAGAAAAA TTACTAAGGA AATTCATAGG ATGACAAAAA CTTTCAGAAC TGAAAAACAG 3240 

GAAATGTAAG CTTTTTAGTT CTTTGGTATT CGAAGTATGC CTAAAAGACA ATGCAAAATC 3300 

CAAGAAAAGA ATGGTGGGGT TTTTGTTTGT TTGGTTTTGT TTTTGTTTTA CAGCTGGAGT 3360 

AGAATACAAA G6GATGC3AGT TGAAACAAAT * GAGAGGAAAT TGGAAtTCTA aAcTTATTCT 3420 

CATTGGCATT AQAAAOQCAC CTACATGTAT TTCACATGAO COGGTGACTG CTGACTTGCA 3480 

TTCTTATTTT TTCCCTATAG ATTAAAAAGG AGGTACAATG GTA GAAC TGT AATCCTGTCC 3540

TTTGTCATAA ATTTTCATAT TCATAAAGGT GAGTGTTAGC COGCTTGTGA AATCTGAAGT 3600 

TGAGTAACTT CAAATACTAA CCACAGAGGO AAAGGCA6CA AGAGGAGAGG CATAAATTTA 3660 

OGATCTCACC CTTCATTOCA CAGACACACA CAOCCTCTCT GCCXACCTCT GCTTCCTCTA 3720 

GGAACACAGO TAAGAGCTTC AAGCXTPCTCC ASCTTAATAA CATGAATTAT TTT TGAGA AT 3780 

AATAATGATA CTGTGTTCTA TATCATGCAT CTCCTGCATT CTGTCT GATT ATATTTTACT 3840 

TATTCTGCCA GAGCAAAATT AAAATACCTA TTTCATCTGA rrTGTCCTTT ATCTA AATTG 3900 

CTXAGTTCCA A6TAAACCAA GGCACTTTTA GGAACACAGA QGGAGAGTGC CTTGCAGCCA 3960 

GAGAGTCTTG AAGGAGATGT CAGG6ACGGA TCTTAACAQC TGGTTG GATG TGATCCACAG 4020 

A0G T CTCCT6 TTAGCATTCA TTGTAAAGCC ATCCTACCTA GCTCTAGTGT AACCAGCAAT 4080 

GAAAGAAAGA TAAAGAGGGT CXSATTACTTA TTTACAATAG TCTTTAAAAA CGTAGTTTTG 4140 

TAAGCCTTCT AATTAGGACA TTAATATATT TAATATATGC ACATTGTAGA AAGATTGAAG 4200 

C3GTTAAAAAT AA6AGAAAAA CTTTAAATGT CAAAATCTCA CAACCCAGAT ATATCATTTC 4260 

TTXAAGAAAA TTGTACTACA AAATACCATT OCATTTATTA AAOTCATTCT GACAGGAATC 4320 

TGATGCTTTT CCAGGAGTTC CAGATCaCAT OGAGTTCACC ATGAATTCAC TCAGTGAAOC 4380 

CAACACCAAG TTCATGTTOQ ATCTGTTCCA ACAGTTCAGA AAATCAAAAG AGAACAACAT 4440 

CTTCTATTCC OCTATCAGCA TCACATCAGC ATTAGGGATG GTCCTCTTAG GAGCCAAAGA 4500 

CAACACTGOV CAACAAATTA GCAAGGTA6C TATCAGCATC ATTAOGTTGT CCTGTTGCAG 4560 

TTTTTCTCTO GTTCOGTOQG CTAGCaOGCA GATGGTAATA QATOTGGTGO TCTGATGGGT 4620 

AGCACAGGGG GCTOTGCAOG AATTCCCATA ACTQTGAOAC CACTGACTTA AACAGATCTT 4680 

TTGAGTAAAG TTTTCTTGTC COGCTTCATG TCTCTTCCAG GTTCTTCACT TTGATCAAGT 4740 

CACAGAGAAC ACCACAGAAA AAGCTGCAAC ATATCATGTG AGTCACAGAG CACTCTGATT 4800 

CAOCTTTAQA TCCCTGAACA 6GTCATAGTT TAAACCTGGA ACTTCACAAA AACTAAGAAA 4860 

AGGCCAGTTT TAGGGAAAAT CTTGGACACA AAGATTQAGA CATACAGAGT QGGTT GGCT.T 4920 

TTCATGGCAC ATAATTATTA TTCCTCATTT CTOCOTTACT AAAA6ACA0T CAGC ACTGT A 4980 

CCTCAGAGCA TAGGTCTGGA TCAGGATAGG CTGGGTTCAQ ACTCCAGCTT TGCTCTTCAC 5040 

AAATGATGAA TAAGAGCAOO ACACAACTGC TCGGAGTCCC AGTGACCTCA TCCCAGAAAA 5100 

CTAAGGGTAA GAAAAAATCT GACTCAATAC ATGCAAATAC ATCCAAATGT TTACAACAGT 5160 

GCCTTGCCCa TAAAAGTCAT AATAAATGTT ATTATTATTA TAAAGTAGCT ATAAITATAC 5220 

TAATCATAAT AATGTGAAAA TAATTTAATT TTCATTGAGT CATTAATGAO ATTCAGAGGA 5280 

ATAAGCACAA GTCCAAGTAT ATTTTGGAAA ATGATTGCTA TGGAATATAT TQGTTTAGAG 5340 

CCTTAATAGT GCAAAATGCT TTGCTGGAAG GTAGAAAGTT CTAGATTTAA ACAGGCTTAG 5400 

GTTCAAAACT TGGCACTTCT AATTTATGTC TCTATAAACA GGGTTTTTTT CCCCATTCTC 5460 

TGAGCTTTCT TGTGTTCATC TGAATT6AAC TAAAGACTTA GAGTTACOCA TOTAAAGTCC 5520 

TTAGCCATGQ ACCTGGCATA CACTCTTCTT AOGTGCAGAO AATOAOCATC ATGAGGAAA6 5580 

AGCCACAGAT CAGTCAATGT GTCCTACAAG ATAATAQCAC CAACAGGTAT AACAGGGCTT 5640 

CCTGGCATAA TCTATTTAAA ATATCCAACC TTCAACATAC TCGTATCCTT GATGACTGTT 5700 

AGAAOTQAAA TATGGTCCTT GCCCATAAGG A6CTGAGAGT TTAACTGGGA AGCTAAAOCT 5760 

AACCCTTTAA ACCAACAAGG AGAAAATCTA CTOGTAGACA GOQC TQCATC TTTAOTTCAO 5820 

AAGAGAAAAG ATTGCAGTAC GTTAGAGCAA GAAtaATTTT CTGQAAGAAG TCAAATATAA 5880 

GGTGGATTTT GAAGGGTATT TGAGGTGAAA TACACCAATT ATCAQGGAAT AACATCAAAG 5940 

GTCCTCAATG AGACTACC3VG CATTTAGGGA CTGATCTAAC AGACTTAGCA TGGGTTTAGT 6000 

ATTTACATTG ATACAGCAAT TGAATGATCT CCTTTTTTGA TGTTTGAAGG TTGATAGGTC 6060 

AGGAAATGTT CATCACCA6T TTCAAAAGCT TCT6ACTGAA TTCAACAAAT CCACT GATOC 6120 

ATATGAGCTG AAGATCGCCA ACAAGCTCTT CG6AGAAAAG AOSTATCAAT TTTTACAGGT 6180 

AATTTGACCT C6CCTACCCA CATTTCATTT GCATOCTGAT GTCPGTGTCT CTQAGTGGCC 6240 

AAATGGAAGA AAGCAAGGCA GATGAGCCTG GCOGACCCAG GTGGAGAGCA TTTACTCAGA 6300 

GTOCATTAGC TCC3VTTTCCA CAACTCTCCC CCACTGGA6T 6TCCCAGACC CCAAOGATAC 6360 

ATCACTGAAG TGTGGATTTA GGGATAATCT TGTGATAAAA GAQSAQBTTO TGTAATAGAG 6420 

TQAGTAAGA6 TAATAAGTAA TAAGATACCA TCGATAAACT GQCACTGACT CAGTCACATA 6480 

OSATACATCT TQGTOGGAAA TGTATGACTA ATGGGATATT ATTOGAATGG GCAGGCTTGG 6540 

GTGAGTTCCT GAGAATA6TT GA6GAAGTAC CAGGAAATAT TGAATGCACA GGATGAAAGA 6600 

CAAAAACAAA GATCAGAAAC ATCATGGTTA AAATTACTGG AGAGAAGTCT GAGAAGCAAT 6660 

GAATCTCCTT CAGGGAAGOC TGCTCTGC3VG TTTGCAAACC ACAGCCTCTT CTGCTTCTGC 6720 

CTTTTOCCAA GATGATATTG ACCTTCAGTG ACCTCTTTCT TGTGCCAGCC CACATTCCCC 6780 

TTTT6CATTG CCTACATGAC ACCTGTATAA AAATATCCAT GGA CAGG AGA TACTGCATCT 6840 

ATTCAG6GTC TGGATTCAGC TTACTOTTG?! TACAAATAAO TAAGTTTGGT AATATATAGT 6900 

TACATAAATT ACTCCTAATT CXSACTTCTT OCTTCATATC TCAAAGGAAT ATTTAfiATGC 6960 
CATCMGAAA TTTTAOCAGA CCAGT6TGGA ATCTACTGAT TTTGCAAATG CTCCAGAAGA 7020
AAGTOSAAAG AAGATTAACT OCTGGGTGGA AAGTCAAA06 AATGGTAiGGA GAGCCACXXA 7080 
TTATAGAAAC ACCTTTGAGA AACCTATGCC AGTGAGCCTT GTGCTTGACA CTGCATGGGG 7140 
GAACAGGTGT 6G6(SITTGA6 ATGGGTTTGC A6GGAGGGCT GAAGAGGGCA CTCCAGATGA 7200 
AGGATTTGTC CAAATGAATA T6AAGAGAGC CTAGGGGAGC CAAGGAQGAA ATCACAGGAA 7260 
GCCAATTAGA 7GGAAACACA TCTGGAGAAT TATTTGCITA TCGOOCTGCA TGACAATAGC 7320 
rrTGTGGATC OCCTGTCTCC 6CTCAGACCT ATTTTGACAT CATATOCTTT ACTTTAAATC 7380 
AGACTCAAAT TTTTATGATG AATATTTAAT AGAAAACATT AGAAAGOSTC TCTCGTCTCC 7440 
TTTACTAArr GGGAAACAAG CAGCTCTCTG GTAAATCACC CTTTTGTCTC TGAGCTGGAG 7500 
CTGCCTGGAT CACATCT6TA GCCAATGTGT TCTGCAGGGA TTATCACAGC TCTCTTCCOC 7560 
ATCAAOGGCA AAGA6CTTGA CSUUtfSTCTCC ATTCTACA6A CATCTTTCTT ACCICOCAOC 7620 
TCTCATTACA GGCCAAACTT ACAGCAACTC AACATGAGA6 TGAATAGGAA GATACCCC06 7680 
GAAGTAGTGT CTGACAGCAC AGGACATGOG TTTCATATTA CAGAGCTCAA GTCACTCATC 7740 
CTAAAATGCA ATCAGGGCCT CCTTCCTCTG AATGGGGACC CXS3TAGTTAA AAAAAAATAA 7800 
AAOTAGGAAS AGGAGGGAGG GAGAAAGGAA A6ACACATGT TGGAAGAGTA GACAAAATCA 7860 
GTTTATCAST ATTCCAAATC AGATGATTGQ AGACATTCAT ACACAGAGAA 0GT6AACTGC 7920 
TTCTCTATCA CAAGAA6TGA TGTCTCCATC AAGGGTAACT TTATAC3QACT GGAGOCTTGA 7980 
A6AAAGCTGC ATCT06TGAA CCACT6GTCA GTGAGTCTAA CAATTCAAAG ATCAAAGTCA 8040 
GTGAGTCTCA AGCAGGGATT TOGGTCAATA ATTAACGATC AGTCAOGAAC ATTTGCAAAG 8100 
CATCrrCCAG ACAA6CCATT TGTAGCTTGT CTAAAAGACT CTTTTATTCT TTCCCTTGCA 8160 
GAAAAAATTA AAAACCTATT TCCZGATGGO ACTATT6GCA ATGATAOGAC ACIGGTTCTT 8220
GTGAAC6CAA TCTATTTCAA AGGGCAGTGG GAGAATAAAT TTAAAAAAGA AAACACTAAA 8280 
GAGGAAAAAT TTTSGCCAAA CAAGGTATTG TCTATATTTT ATTTATATAG TGTAATATGT 8340 
TAATACATGG AATGTTAAAC ATTTCTGATO GAATGTAACA TGATAAGTAA AAAATAAAAA 8400 
TTGTTCATGT CTGTTATTTT GTTGTTTTAC TCTTATAACT TTATTTAGTT AGGAATACCT 8460 
GAAAAACIAT TGT7TCTAAC TCATGGAATT CCTGG6TTAT TTCTTAGAAG AAGAAGGATO 8520 
TGTTGCTATC TCAATAATAT TATCTTTTTT GTCTTGTGTT TCAOGTGTTA TTT6TTGGAC 8580 
ACATTGATTT ATTGCAGAAT ACATACAAAT CTGTACAGAT GATGAGGCAA TACAATTCCT 8640 
TTAATTTTGC CTTGCTGGAG GATGTACAGG CCAAGGTCCT GGAAATACCA TACAAAGGCA 8700 
AAGATCTAAG CATGATTGTO CTGCTGCCAA ATGAAATCGA TGGTCTGCAG AAG GTAAG AA 8760 
CTTGCATCTA CAACTCTTCC TTCTACTOCC OOACATTTTT OCAAAGATAC CAAGTTTAAA 8820 
CAAGGTAAAA GCTTATGACC GAGTIGCCTC AAAAT6ATGA AAAATTCTAA ATGAG6AATG 8880 
ATGACTCACC TTCATATTAC AAATATTTGA GCATACS3GCC TGACACAAAC TGAAAGCTTA 8940 
GTTTTTGTTT GTTTGTTTGT TTTTATTATT ATTATTATAA TACTTTAAGC TTTAGGGTAC 9000 
ATGTGCACAA TGTGCAGGTT AGTTACATAT GTATACATGT GCCAT6CTGG TGTGCIGCAC 9060 
CCATTAACrC ATCArTTAGC GTTAGGTATA TCTCCTAATG CTATCCCTOC OOOCTOCCCC 9120 
CACCCCACAA CAGTCCTCAO AGTGTGATGT TACCTTCCTG TGTCCAAGTG TTCTCATTGT 9180 
TCAATTCCCA TCTATGATTT AATTCCATCT ATGGCTTAGT TAATGATTAA TTTATTAGAG 9240 
TTACATGCAT TGGATATCAA TTTGATGATA TTATTATGCA GCAATTTAAA CTTGACTGGG 9300 
AGAAATATAT ACCAATGTGA GGAAAGTTTA CAAATAGGCC GAGTAGAAAA GGGAATACAA 9360 
ATTTAGGAAT TTAGGGAATT ACAATTTAAT AATTGCAATG TGTACTAAAT AATGTATACA 9420 
GAAAAATATO ATGAGCXTTAT TAAAAATTGA CACATGTAGT AGGCTGTTGG CACAAGAAAT 9480 
AGTGATACAT ACAGTTCATT GTGTACAAAA TAATGTAATC ATATTTTACA TGTGTATCAT 9540
ACAGTTGTAT ACATACATAT GTACACATAT ACATATAOGT AAAAACATGA TTCTGTTTTT 9600 
ACATACATGT ATATACATAT ACACATATAA CCCAATGTAT TTATATATTC AGGACTCATA 9660 
TTTTACCTAT TAGAATAATA ATGTCTATTA AAGTGAA<XT TCTGTATTTC ACATTTATTG 9720 
CCAAAATAAC GAATCTCCAC ATAGTCAATT CATTGTTAAG GTGTATTAGA GATOGACAGT 9780 
7AGTCATATC AGTTTCTTTT TTCCATTTGT ATAGCTTGAA GA6AAACTCA CTGCTGAGAA 9840 
ATTGAT6QAA TG6ACAAGTT T6CA6AATAT GAGAGAGACA TGTGTCGATT TACACTTACC 9900 
TCGGTTCAAA ATGGAAGAGA GCTATGACCT CAAQGAGAC6 TTGAGAACCA TGGGAATGGT 9960 
GAATATCTTC AATGGGGATG CAGACXTTCTC AQGCATGACC TGGAGCCAOG GTCTCTCAGT 10020 
ATCTAAA6TC CTACACAAGG CCTTTGTGGA GGTCACTGAG GAGGGAGTGG AAGCTGCAGC 10080 
TOCCA008CT GTA3TA0TAG TOOAATTATC ATCTCCTTCA ACTAATGAAG AGTTCTGTT6 10140 
TAATCACOCT TTCCTArPCT TCRTAAGGCA AAATAAGACC AACAGCATCC TCTTCTATGG 10200 
CAGATTCTCA TCCCCATA6A TGCAATTAGT CTGTCACTCC ATTTAGAAAA TGTTCACCTA 10260 
GAGGTGTTCT GGTAAACTGA TTGCIGGCAA CAACAGATTC TCTTGGCTCA TATTTCTTTT 10320 
CTATCTCATC TTGATGATGA TAGTCATCAT CAAGAATTTA ATGATTAAAA TAGCATGCCT 10380 
TTCTCTCTTT CTCTTAATAA OCOCAC^TAT AAATGXACTT TTGCTTCCAO AAAAATTTCC 10440 
CTTGAGQAAA AATGTOCAAG A7AAGATGAA TCATTTAATA C0GTG7CTTC TAAATTTGAA 10500 
ATATAATTCT GTTTCTGACC TGTTTTAAAT GAACCAAACC AAATCATACT TTCTCTTCAA 10560 
ATTTAGCAAC CTAGAAACAC ACATTTCTTT GAATTTAGGT GATACXTTAAA TCCTTCTTAT 10620 
GTTTCTAAAT TTTGTGATTC TATAAAACAC ATCATCAATA AAATAATQAC ATAAAATCAT 10680 
rm - UCYATA CCTGTTTTCT CTCTaOAAAO GGCAAOTOTC CAGTTACACA TAGGAAAfiAT 10740 
AATTTA6A6A TATATTAATC ATATATAAAG GAAAATTAAA AACAGAGTAG TTGATGATGA 10800 
GCCTGQAGTA GAAGGCATAT CCCA6AACAG GAGGAGOCTT GTAAACCACA TAGGAACTTC 10860 
CTATTTTATG CTAAAGGGAT AAGAAACTCA TTACAGGCTT T6ATG0TTGT TTGTCAAAGA 10920 
GG66CATAAA ATTATCATAT CCACATCTAG AAAATACATC TCTGGCTAOG CTGATATCAA 10980
TGGATGOQAG GAAAGAACAG T6T6GTTA0C ATATATAAAT TA6GAAATCA TTAGAGTATT 11040 
GGGAGTGGAA ATG6A6AGAA AGAAAGAGOC TGGGG6AATT ATTTA6GAAA TAAT AGTTA C 11100 
AGAAAGACAT CTAAGTTOCT GACCTATCTG ACTGGATGGA TGGTiAGAATA TCTTGTTTCT 11160 
GAGAGAAAAA AAGACTTTQG GTTTAAATTT GTACTTGATG AATTAAGGTA CTTTTAATAT 11220 
TCAAATG6AT TTGCCTGGCA GGCACTTGAA GATATTAGTC TAAATCTCAO AAACAGAATA 11280 
TGATCTGAAG CTCIAAATTT GTGATATTCA ATATAAATAC TTTAGAGTCA TT6G6ATAAA 11340 
TATGGTAGTT GTAGCTAAAA GCAAAAATAA QATACTAG6G AGAAAGQATA AAGTTAGAAG 11400 
AAAGAAGAAT CTAGAATTGA CCTTQAAGTA TATCA6CATG TGTAAA6ATC ASGAATT6AT 11460 
CATTTTTATT TTOCAOAAAQ TAGCTTTTCT TAGGGTTCCA TATTTACPOC CATAGATTCT 11520
TCOC 

Seq ID NO: 465 Protein sequence
Protein Accession #: BAB21525.1

1 11 21 31 41 51

| | | | | |

MNSLSEAZmC FMFDLFIQOPR KSKEKHXFYS PISZTSALGM VLX/SARDNTA QQISKVLBFD 60 

QVTENTTEKA ATYHVDRSCai VHHQPQKLLT EPMKSTDAyB LKIAKKLPGE BCTYQPLQBn* 120 

DAIKKPYQTS VESTDPANAP EESRKKINSW VESQTNEKIK NLPPDGTIGEI DTTLVLVKAI 180 

YFRGQWEI7EF KKEHTKBEKP WPNRMTYKSV QMMRQinfSFZ' FAZiLEDVOAR VZiEZPYKGKD 240 
LSNIVUiRIB ID6LQKLEBK LTAEKIMEHT SLQHMRETCV DLRLFRFIMB BSYDLKDTLR 300

TNGNVNXFKG DADLSGMTHS EQLSVSKVttB KAFVBVTBE6 VEAAAATAW WELSSPSTM 360 

EEPCOJHPPL FPIRQNKTNS ILPYGRPSSP 

Seq ID NO: 466 DNA sequence

Nucleic Acid Accession #: NM_001910.1

Coding sequence: 50..1240

1 11 21 31 41 51

| | | | | |

GGAGAGAA6A AAG6AG60G6 CAAGOQAGAA OCTOCTO G TC QQACTCACAA TGAAAAG6CT 60 

CCTTCTTTTG CTGCTGGTGC TCCTOGAGCT GG6AGAGGCC CAAGGATCCC TTCACAGGGT 120 

GCXXXTCAGG AGGCATCOGT CCCTCAAGAA GAAGCTGOGG GCACGGAGCC AGCTCTCTGA 180 

GTTCTGGAAA TCCCATAATT TGGACATGAT CCAGTTCACC GAGTCCTGCT CAATGGACCA 240 

GAGTGCCAA6 GAACCCCTCA TCAACTACTT GGATATGGAA TACTTOSGCA CTATCTCCAT 300 

TGGCTCCCCA CCACAGAACT TCACTGTCAT CTTOGACACT GGCTCCTCCA ACCTCTGGQT 360 

CCCCTCTGTG TACTGCACTA GCCCAGCCTG CAAGAOGCAC AGCAGGTTCC A6CCTTCCCA 420 

GTCCAGCACA TACAGCCAGC CAGGTCAATC TTTCTCCATT CAGTATGGAA CCGGGAGCTT 480

GTCCX3GGATC ATTGGAGCCG ACCAAGTCTC TGTGGAAGGA CTAACCGTGG TTGGCCAGCA 540 

GTTTGGA6AA AGTGTCACAG A6CCAGGCCA GACCTTTGTG GAT6CAGAGT TTGATGGAAT 600 

TCTGGGCCTO GGATACCCCT CCTTGGCTGT GGGAGGAGTG ACTCCAGTAT TTGACAACAT 660 

GATGGCTCAG AACCTGGTGG ACTTGCC3GAT GTTOTCTGTC TACATGAGCA GTAACCCAGA 720 

AGGTGGTGCG GGGAGCX3AGC TGATTTTTGO AGGCTACGAC CACTCCCATT TCTCTCGGAG 780 

CCTGAATTGG GTCCCAGTCA CCAA6CAAGC TTACTGGCAG ATTGCACTGG ATAACATCCA 840 

GGTGGGAGGC ACTGTTATGT TCTGCTCCGA GGGCTGCCAG GCCATTGTGG ACACAGGGAC 900 

TTCCCTCATC ACTGGCCCTT COGACAAGAT TAAGCAGCTG CAAAACGCCA TTGGGGCAGC 960 

CCCCGTGGAT GGAGAATATG CTffTGGAGTG TGCCAACCTT AACGTCATGC C3GGATGTCAC 1020 

CTTCACCATT AAOGGAGTCC CCTATACdCT CAGCCCAACT GCCTACACCC TACTGGACTT 1080 

06TGGATGGA ATGCAGTTCT GCAOCAOTGO CTTTCAAGGA CTTGACATCC ACCCTCCAGC 1140 

TGGGCCCCTC TGGATCCTGG OGGATGTCTT CATTCGACAG TTTTACTCAG TCTTTGACOG 1200 

TGGGAATAAC OGTGTGGGAC TGGCCCCAGC AGTCCCCTAA GGAGGGGCCT TGTGTCTGTG 1260 

CCTGCCTGTC TGACAGACCT TGAATATGTT AGGCTGGGGC ATTCTTTACA CCTACAAAAA 1320 

GTTATTTTCC AGA6AATGTA 6CTGTTTCXA 06GTT6CAAC TTGAATTAAC AOCAAACAGA 1380 

ACAT6A6AAT ACACACACAC ACACACATAT ACACACACAC ACACTTCACA CATAGACACC 1440 

ACTCCCACCA CX33TCATGAT OQAQGAATTA OGTTATACAT TCATATTTTG TATTGATTTT 1500 

TGATTATGAA AATCAAAAAT TTTCACATTT GATTATGAAA ATCTCCAAAC ATATGCACAA 1560 

GCAGAQATCA TGGTATAATA AATCCCTTTG CAACTCCACT CAGCCCTGAC AACCCATCCA 1620 

CACAOGGOCA GGCCTGTTTA TCTACACT6C TGCCCACTGC TCTCTCCAGC TOCACATGCT 1680 

GTACCTGGAT CATTCTGAAG CAAATTCOGA GCATTACATC ATTTT6TCXA TAAATATTTC 1740 

TAACATOCTT AAATATACAA TCGGAATTCA AGCATCTCCC ATTGTCCCAC AAATGTTTGG 1800 

CTGTTTTTGT AGTTGGATTO TTTGTATTAQ GATTCAAGCA AGGCCCATAT ATTGCATTTA 1860 

TTTGAAATGT CTGTAAGTCT CTTTCCATCT ACAGAGTTTA GCACATTTGA A06TTGCTGG 1920 

TTGAAATOOC OAOGTGTCAT TTGACATGGT TCTCT6AACT TATCTTTCCT ATAAAATGGT 1980 

AGTTAGATCT GGAGGTCTGA TTTT6TGQCA AAAATACTTC CTA6GTG0T0 CTGGQTACT7 2040 

CnCTTGCAT 0CT6TCAGGA G6CAGATAAT GCTGGTGCCT CTCTATTGOT AATGTTAAGA 2100 
CTGCTGGGTG GGTTTGGA6T TCTTGGCTTT AATCATTCAT TACAAAGTTC AGCATTTT 

Seq ID NO: 467 Protein sequence
Protein Accession #: NP_001901.1

1 11 21 31 41 51

| | | | | |

MKTLLLUiLV IiLBLGBAQGS ZiKRVPLRRBP SLKKKLRARS QLSEFWKSBN LDI4ZQFTESC 60 

SMDQSAKBPli ZNYLDMEYFG TZSIGSPPQN FTVZFDT6SS NLWVPSVYCT SPACKIHSRF 120 

QPSQSSTYSQ PGQSFSIQYG TGSLSGIIGA DOVSVEGLTV VGQQPGESVT BPGQTPVDAE 180 

FDGZI^GLGYP SLAVGGVTFV FONMMAQNLV DLPMFSVYMS SNPEGGAGSB LZFGGYDHSH 240 

FSGSLNWVPV TKOAYHQIAL X3NZQVGGTVM FCSBGOQAIV DTGTSLITGP SDKXKQLQNA 300 

ZGAAPVDGBY AVECANLNVM PDVTFTZNGV PYTLSPTAYT LLDFVDGMQF CSSGFQGIiDZ 360 
BPPAGPLNZL GDVFZROFYS VFDR613KRVG LAPAVP 

Seq ID NO: 468 DNA sequence

Nucleic Acid Accession #: NM_018058.1

Coding sequence: 319..1575

1 11 21 31 41 51

| | | | | |

TAC60GCIGC GGGACCG6CA GGGGAAOSCC ATCGGGGTCA CA6CCT6CGA CATGGACGGG 60 

GA0GGCC6GQ AGGAGATCIA CTTCCTC A AC ACCAATAA3Q CCTTCTC6GG GGTGGCCAGG 120 

TACACOGACA AGTTGTTCAA OTTCCGCAAT AACCGGTGG6 AAGACATGCT GA GOGAT GAG 180 

GTCAACSTQG CCOGTGGTGT GGCCAGCCTC TTT6C06GAC 6CTCTGTG6C CTGTGTGGAC 240 

AGAAAGG6CT CTGGACGCTA CTCTATCTAC ATTGCCAATT ACOCCTACGO TAATGTGGOC 300 

CCTGATGCCC TCATTGAAAT GGACCCTGAG 6CCAGTGACC TCTCC06GGG CATTCTG60G 360 

CTCAGAGATQ TG6CTQCIGA GGCTGGOOTC AOCAAATATA CAGGGGCSCGQ AG606TCAGC 420 

GT6GG0CCCA TCCTCAGCAG CAGT6CCTOG GATATCTTCT GOGACAATGA GAAT6G6CCT 480 

AACTTCCTTT TCCACAACCG GGG0GAT6GC ACCTTT6TGG ACGCTGCG6C CAGTGCTGGT 540 

GTGGAOSACC CCCACCAGCA TGGGC6AGGT GTCGCCCTGG CTGACTTCAA CCGTGATGOC 600 

AAAGT66ACA TCGTCTATG6 CAACTGGAAT GGCCCCCAOC OCCTCTATCT 6CAAATGA6C 660 

ACCCA TGGGA AOGTCGBCTT OOO QGACAJ C OCCTC A CCCA AGTTCTCCAT GOOCTCOC CT 720 

GTCOGGAOGO TGATCAOCGC 0GACTTT6AC AAT6A0CAG6 AGCTGGA6AT CTTCTTCAAC 780

AACATTGCCT ACCGCAQCTC CTCAGCCAAC CGCCTCTTCC GCGTCATCOQ TAGAGAGCAC 840 

GGAGACCaX TCATCGAGGA GCTCAATCCC GGCGAOGCCT TOGAGCCTGA GGGCCGGGGC 900 

ACAGGGGGTG TGGTGAC06A CTTOGAOGGA GA03G6ATGC TQGACCTCAT CTTGTCCCAT 960 

G6AGAGTCCA TGGCTCAGCC CCTGTCOSTC TTCOQOGGCA ATCAGG6CTT CAACAACAAC 1020 

TGGCTGGGAG TGGT6CCA06 CAC006GGTT GG6GCCTTTG CCAG66GA0C TAAGGTOGTG 1080 

CTCTACACCA AGAAGAOTOG GGOCCACCTG AGGATCATCG ACGGGGGCTC AGGCTACCTG 1140 

TGTGAGATGG AGCCCGTOGC ACACTTTCGC CTGGGGAAGG ATGAAGCCAQ CAGTGTGGAG 1200 

6TGACGTGGC CAGATGGCAA GAT GG TGAGC 0GGAA0GTG6 CCAOO GGG GA GATGAACTCA 1260 
GTGCTS6AGA TCCTCTAOOC OOSGGATCAG GACACACXTC AG6A00Q\GC CCCACIGSAfi
ACACCAATGT^ ATGCATCCAO TTCCCATTCG TB TO OCCTCG A6ACAAGCCC GTATGT6TCA
ACACCTATGG AAGCTAC3VGG TGCCGGACCA ACAAGAAGTG CAGT06GGGC TACGA6CCCA
ACGAGGATQG CACAGCCTGC GTGGGGACTC TCGGCCAGTC ACCGGGCCCC CGCCCCACCA
CCCCCACXX»: TGCTGCT6CC ACTGCOC S CTG CTGCTGCOGC TGCTGGAGCT GCCACTGCTG
CA0C6GTCCT CGTAGATG6A 6ATCTCAATC TQGGGTGQGT GCTTAAOGAO AOCZGOGftGC
CCAGCTOCTG A6CAGG6GT0 G6ACATGAAC CA606QATGG A0T0CA6CAG CGQAGTCGGA
Ai\GT06GCTT GTGCTGCTGC CTAGACAGTA 6GGATGTAAA GGCCTGGGAG CTAGACCCTC
CCCAAOOCCA TCCATGCACA TTACTTAGCT AACAATTAGG GAGACTOGTA AGGCCAGGCC
CTGTGCTGGG CACATA6CTG TGATCACAGC AGACAGGGTC GCTGCOCTGA TGGCXCTTAC
ATTOCAGTQG GTCZAATGAC CATATCTTAG GACACAGATQ T600QU3GGA GQT6GTGTCA
CTGCACAGGA AGTATGAGGA CTTTAGT6TC CTGA6TTCAA ATCXTGATTC AGOU^CTCAC
AAAGCTATGT GACCTTACAC CACSTCACTtA ACTTGTTAGC CATCCATTAT GGCATCT6CA
AAATGOGGAT TAAGAATAOA ATCTTOGGGT TAQTGTGGAG ATTAGATTAA ATGTATGT7A
GACACTTCGC ACAAAACCTG GCACATAGTA AAGGCTCAAT AAAAACAAGT GCCTCTCACT
GOGCTTTGTC AACAOGTO

Seq ID NO: 469 Protein sequence
Protein Accession #: NP_060528.1

1 11 21 31 41 51

| | | | | |

MDPEASDLSR GILAIiRDVAA EAGVSKYTGG RGVSVGPILS S5ASDZPCDN ENGPNFIiFHN

RGDGTFVDAA ASAGVDDPHQ HGRGVALADP NRDGKVDIVy OIWHGPKRLy LQMSTHGKVS

FROZASPKFS MPSPVRTVIT ADFDNDQELE IFPNNIAYRS SSANRLPRVI RREHGDPLIB

EtiKPGDALEP EGRGTGGWT DPDGDGMLDL ILSHGBSMAQ PL5VFRGHQG PNNNWLRWP

RTRVGAFARQ AKWLYTKKS GAELRIZDGG S6YI«Ca4EFV ARFGL6KDEA SSVEVTWPDG

KMVSSNVAS6 BMNSVLEZLY PRDEDTIjQDP APLETPKNAS            PYVSTPMEAT

GAGPTRSAVG ATSPTRMAQF AHGLSASHRA PAPPPPPULIi PLPTiTiTiPUiK IiPUiHRSS






Seq ID NO: 470 DNA sequence
Nucleic Acid Accession #: AJ279016
Coding sequence: 1..1962

1 11 21 31 41 51

| | | | | |

ATGTCCAQGA T6TTACXX3TT CCTGCTGCTG CTCTGGTTTC TGCCCATCAC T6AGGGGTCC
CAGOG06CTG AACCCATGTT CACTGCAGTC ACCAACTCAG TTCTGOCTCC TGACTAT6AC
A6TAATCCCA CCCAGCTCAA CTATGGTGTG GCAGTTACT6 ATGTGQACCA TGATGG(S3AC
TTTGAGATOG T06TGG0GG6 GTACAATGGA CCCAACCT6G TTCTGAAGTA TGACGGGGCC
CAGAAGCGGC TGGTGAACAT OGCGGTOGAT GAG06CAGCT CACCCTACTA OGCGCTGCGG
GACOGGCAGG G6AACXSCCAT 06GGGTCACA GCCTGGGACA T06ACGGGGA CGGCCGG6A6
GAGATCTACT TCCTCAACAC CAATAATGCC TTCTGGGGGG TGG0CACX3TA CAC06ACAA0
TTGTTCAAGT TCa5C3UlTAA CCGGTGG6AA GACATCCTGA GOGATGAGGt CAAGGTGGCC
OGTGGTQTGG CCAGCCTCTT TGCOGGACGC TCTGTGGCCT GTGTGGACAG AAAGGOCTCT
GGACGCTACT CTATCTACAT TGCCAATTAC GCCTA06GTA ATGTGGGCCC TGATGCCCTC
ATTGAAAT6G ACCCTGAGGC CAGTGACCTC TCCO GGQGCA TTCTGQCGCT CAGA6ATGTG
GCTGCTQAGO CIG0G0TC3\O CAAATATACA GGGQOCOGAG GOGTCAGOGT GGGCCCCATC
CTCAGCAGGA 0T6CCT0G6A TATCTTCTGC GACAAT6AGA AT6G6CCTAA CTTCCTTTTC
CACAACCGGG GOSATGGCAC CTTTGTGGAC GCTG0G6CCA GTGCTGGTGT GGA06ACCCC
CAOCAGCATG GGCGAGGTGT CGCCCTGGCT GACTTCAACX: GTGATGGCAA ACTGGACATC
GTCTATGGCA ACTGGAATG6 CCCCCACCGC CTCTATCTGC AAATGAGCAC CCATGG6AAG
OTOGGCTTCC Q GGACA TOGC CTCACCCAAO TTCTCCATGC OCTOCOCTGT CCGCAOQGTC
ATCAC06C00 ACTTT6ACAA TGACCA66A6 CltSGAGATCT TCTTCAACAA CATT60CTAC
CX3CAGCTCCT CAGCCAACXX3 CCTCTTCCGC GTCATCOGTA GAGA6CACGG AGACCCCCTC
ATCGAGGAGC TCAATCCCGG 06AOGCCTTG GAGCCTGAGG GCCQOGGCAC AGGGGGTGTG
GTGACOGACT TCGACGGAGA OGGGATGCTG GACCTCATCT T6T0CCATGG AGAGTCCATG
GCTCAGOOGC TGTC06TCTT            CAGOGCTTCA ACAACAACT6 GCTGOGAGTO
GT60CA0GCA CCC6GTTTG0 GGCCTTTGCC AGGGGAGCTA AGGTOGTGCT CTACACCAAG
AAGAGTGGGO CCCACCTGAG GATCATCGAC GGGG6CTCAG GCTACCrGTG TGAGATGGAG
COOGTOGCAC ACTTTGGCCT GGGGAAGGAT GAAGCCAGCA GTGTGGAGGT GACGTOGCCA
GATGGCAAGA TGGTGAGCOG GAACGTG6CC AGOSGGGAGA TGAACTCAGT 6CTG6AGATC
CTCTAOCX^ GGGATGAOGA CACACTTCAQ GACCCAGCCC CACTGGAGTG TGGCCAAGGA
TTCTOCCAGC AGGAAAAT06 CCATTQCATG GACACCAATG AATGCATCCA GTTCCCATTC
GTGTQOOCTC GA6ACAA6CC CGTATGTGTC AACACCTATG GAAGCTACAG GTGCCGQACC
AACAAGAAGT GCAGTOGGGG CTAOGAGCCC AACX3AGGATG GCACAGCCTQ CGTGGGGACT
CTG6GCCAGT CACCGGGCCC COGCCCCACC AiCCCCGACCG CTOCTGCTGC CACTGCOGCT
GCTGCTGCCX5 CTGCTGGA6C TGCCAC76C7 GCAC0G6TCC TGGTA6ATGG AGATCTCAAT
CrGG Q OTOQ G TGGTTAAGGA GAGCTQCGAG COCAGCTGCT GAGCAGGGGT GGGACATGAA
CCAGOQGATG GAOTCCAGCA GGGGAGTGGG AAAGTGGGCT TGTGCT6CT6 CCTAGACAGT
AGGGATGTAA AGOCCTGGGA GCTAGACCCT CCCCAAGCCC ATCCATGCAC ATTACTTAGC
TAACAAT7AG GGA6ACT0ST AAGGCCAGGC CCTGltfCTGG GCACATAGCT GIGATCACAO
CA6ACAGGGT CGCTGCCCTG ATGGOGCTTA CATTCCAGTG GGTCTAATGA CCAXATCTTA
GGACACAGAT GTGCCCAGGG AGGTGGTGTC ACTQCACAGG AAGTATGAGG ACTTTAGTGT
CCTGAGTTCA AATCCTGATT CAGGAACTCA CAAAGCTATG TGACCTTACA CCAGTCACTT
AACTTGTTAG CCATCCATTA TCGCATCTGC AAAATGGGGA TTAAGAATAG AATCTTGGG6
TTAGTGTGGA GATTA6ATTA AATGTATGTA AGAC ACTTGQ CACAAAACCT GGGACATAGT
AAAGGCTCAA TAAAAACAAO TGOCTCTGAC TGQGCTTTGT CAACA06



Seq ID NO: 471 Protein sequence
Protein Accession #: CAC06451

1 11 21 31 41 51

| | | | | |

MSRMLPFXiLL liHFLPZTBQS QRAEPMFTAV THSVLPFDYD SNPTQLNYGV AVTDVDHDGD 
FBZWAGmG FNXiVUODBA QKRLVKZAVD E&SSPYYALR DRQGHAIGVT ACSIDGDGRE 
BIYFUmnOk FSQVATYIDK LFKFRNNRHB DZLSDBVNVA RGVASLFAGR SVACVDSK6S 
60
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1620
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
GRYSIYIMnr AlQHVOIDAIi ISMDPBftSDIi SHOIIALRDV
LSSSASDIFC BliaiOFMFLF HHSODelFVS tMStOVBD? 

vytanoiGPRR lylomstugk vkfrdiaspk fsmpsfvrtv 

RSSSANKLFS VIRRE8GDPL lEELNFGDAIi EPEGSGTCGV 
AQPbSVPBGN QGFIINNHI.RV VPRTSFQAFA RGAKWLYTK 
PVMIFGIiaXD BKSSVBVTHP DGIMVSRIVA Saa«SVI>BX 
FSQQEHGHCM OTNBCIQFPF VCVSDXPVCV RnOSXKOar 
LGQSPGPRPT TPTAAAATAA AAAAAGAATA APVLVOSDUI 

Seq ID NO: 472 DNA sequence
Nucleic Acid Accession #: FGBIESHH
Coding sequence: 1..47M






                                 AAEAGVSKYT GGR6VSVGPI            240

                                 BQHGRGVALA DFMRD6KVDI            300

                                 ITADFQHDQB LEIPFNNIAY            360

                                 VTDFDGDGKL DLILSHGESM            420

                                 KSGAHLRIID GGSGYLCEME            480

                                 Ly?RDBDTX4 DPAPXiBOGQG            540

                                 KKXCSKGYEP NEDGTACVGT            600

                                 L6SWKESCB PSC

1 11 21 31 41 51

I I I I I I

ATG6C6TGTC CGGGAGGACT 0CCA6C00GT TGCTCTGGTT GGATGGGACT GGGTQG6CCC 60

AGGGQCTCCT CCCCAGCATC CCCTCCCCAT TCCTCCTCCA 0GTACAATG6 ACCCAACCTG 120

GTTCTGAAGT ATGAOOGGGC CCAGAAGOGG CK^TGAACA TOGOGGTCGA T6AGCGCAGC 180

TCACCCTACT AOGGGCTGCG GGACOGGCAG GGGAACGCCA TOGGGGTCAC AGCCTGCGAC 240

ATCGACGGGG ACGGCOGGGA GGAGATCTAC TTCCTCAACA CCAATAATGC CTTCTOGGGC 300

CACAGCAGCT CAGOCCAGGT CCCTTCTGGG CTCCACA6AA ACAQ60CTGT 6CTGAAG0CT 360

CCACCTACAA CCCCTGCAGG CCTCCTGGGT CTGCCTCCAC TCA6CG6AAG G6ACTTTTCC 420

TCCTCCCTGG GTCAGGCTTC TOCGGACAGC AGQCAGGGAG AGAGGGTGCC GGTTCCCTGC 480

TGTOGGGGTG GACTGAGACC TACCCATGAA CCA6AACCAT TTCTTCTGAG ACCCAAATCA 540

GGGGTGGCCA CGTACACCGA CAAGTTGTTC AAGTTCCGCA ATAACGGGTG GGAAGACATC 600

CTGAGOGATS AGGTCAACGT GGOCCGTGGT GTGGCCAGCC TCTTTGC0G6 ACGCTCTGTG 660

GCCTGTGTG6 ACAGAAAGGG CTCTGGACGC TACTCTATCr ACATTGCCAA TTAOGOCTAC 720

G6TAATGTG6 GCCCTGATGC CCTCATTGAA ATGGACCCTG AGGCCAGTGA CCTCTCCCGG 780

GGCATTCTGG C6CTCAGAGA TGTGGCTGCT GAGGCTGGGG TCAGCAAATA TACAGAAGGC 840

TTCTCCCACA CTGCCTCTCC AAGCATTGGT GAGATATCTG GCAGAACCGA GGAGCGGGAA 900

GGAGGAGACC CA6A0SA66C AGATGAG6AS CACAGTGGGG ATGGAAGCAC CAOCCAACTG 960

T600QGCTGG GCIGGAAGGA CGGGCAGTTC AAGGAAGAAG CA6CAGCTTT GGTGGAGGAA 1020

CA6AG0QAG6 CTXSGGGCAGC TGG06TGCCC AGAGGA06TG TTG6AACAGC TCTGCAGACT 1080

TGCAAAA6CC ATTTGGCTGA CAAGAAOCTA TTTGX3C0CAC CATGTTACTA TTCTGTCTGC 1140

G06CCTTCTC CAGCCCACXX: TTTCCCTGCC CGCCAAGCCC CCCAACACTA CCCTGTAGCC 1200

OCCCTTGTCA CTCAGCTAAT GACACATGQA COTCTGGCTO GAAAACTAGC CCGGAGTGTC 1260

CCCCACCCCC GA6CCCCAGG AATQGACCCC AAATGTAAGG GCOGCCATGC TGAGCCOGGC 1320

CTGATQ6CTO AG0CTTT666 OGC6TOGCCA GCGCTCAGCA CCACTGTGGT GCCAGQG6GC 1380

CTGAGAAGCT GG6AGGAAA6 CA6GCAGAAG GG6CAGGCCA TGTCCAGATO T6CACTGAG0 1440

GAGCTGGGAd GTCCCTGGAG CCAA6CCACA CAGCACCTGC CTGCTAGAGA GCTGTATGAC 1500

CTGGGAGAAC CTCCCATTTT ACAAAGAACA GAOGGAGATC CAGGQAGGAG AAGGGACTCG 1560

CCCAAGGTCA CACAGGAGTG CCATCTAGTG GCCACCATGC CAGCTCT06G GG6ACTGGA6 1620

GGCCCGGGGA GGGTGGCCAA GCGAGA&ATT GGGAGAGAGA CTGGQGCAGT AGQAAGACCA 1680

CTCTCCCATC CCCTGGTCCC CAACTTCGCC AGCTGCTTGA GGCCTCTTGA AGCGGGGACA 1740

GTGC06GGAG CTGCCCTGCC TGGGAATCCT GGGAACTGGG TTCTOGACAT GGCCAAGGCC 1800

CTGGCGTGGA ACCAGATGGA AAAAGAGGAO GGGAAGATTC ATGGAGACCA TGAGCCCAGA 1860

TTTAGGCTCA GGAAAGCACG GGAAGCAGAA TTCCCCCCAG GCTCCTCTGA GGAGCCTCTG 1920

CIGCAGTTCC CCTGAG6CCT CAGAGGCAGC CCTGTCCTCC AGGTGG6CCT OGGGCTTQCT 1980

TCTGCCAiCTC ACTGTGG6TC GATGTCTTTT CTAGG6GGCC GA0606TCAO GGTGGGCCCC 2040

ATCCrCAGCA GCAGTGCCTC GGATATCTTC TGOGACAATG AGAATGGGCC TAACTTCCTT 2100

TTCCACAACC GGGGOGATGG CACCTTTGTG GAOGCTGCGG CCAGTOCTGA AOGTOGTTTA 2160

GCCTTCATOG TTCACCTCAA ATATCACCTC TGCAGAGATT TTCCTCACTC CCTGTGCCAC 2220

CXAOCAGAAA CTGOTCCTTC CTCCTCCTOC [illegible] ATGCA06TCT TCTTCAGGCr 2280

CCACATT6CC ATCATGGTTT GTCTATGAGC TTTACAA6GA CCGGGTCA06 GTTCTATTCA 2340

TTCTTGAOGC AAGGCTTGGC CTCCAGTGOC CACGGGAGGA CACTCAGCCT OCAGOGTTCT 2400

CAGGGGGCCC CACCCTGCCT TCTGGCAAGA GCTCCCTGTG TCCTGGGGTC TCTGATCCCC 2460

ACTGCCTATT ACATTGTCCT QTG6TCTGGC ATCCCAQAGA GCCTGAT6AC CCACAGCTAT 2520

TTGTGCTCT8 AAAGAGTCAR CGTGG6TGTQ GAO6AC0C0C ACCA0CATG6 G0BAGGT6TC 2580

GCCCTGGCIG ACTTCAAC06 TGATGGCAAA 6TGGACAT06 TCTATGGCAA CTGGAATGGC 2640

CCCCACOGCC TCTATCTGCA AATGA6CACC CATGGGAAGG TC06CTTC0G 6QACAT0GCC 2700

TCACCCAAGT TCTCCATGCC CTCCCCTGTC CGCACG6TCA TCACCGCOGA CTTTGACAAT 2760

GACCAG6A6C TQGAGATCTT CTTCAACAAC ATTGCCTACC 6CAGCTCCTC AGCCAACCGC 2820

CTCTTC0GA7 GCTGCATCCT G0CTOGTO9C TCTTCArCCT TGAGAGCTGG TT3Q6A06AAC 2880

GGTGAGGGA6 AAGGTTTAAO AATCAGAAQG GGAGGGTTCC CAGG6CCAGG 6GGTCAGGCC 2940

AAG6TCAACA CAGGTCCCCT GATGAAGAAA CA6AAAGGAA GGAAGGACGA GGACTGGGCA 3000

AGAGGCT6T0 6GAATGCAG0 GCAAAGCCTG GCCAAGGA6C OGGCCTCTGC TATT6CAGGG 3060

AAAGGGAAGG GAAATGTG6C CCAAAGT6TG CCCAGAACOC AAG06CCACA AOATACAAAG 3120

OCACACTAOC ACAAAAAOGG 6CTACAG0GT CCAAZGftCTA OCAGGAAAAO GOGCTAOGGG 3180

GTOCAATCAC TACCAGOAAA AGGGGCTAOS GQOTCCAATC ACTACCAGGA AAAGGGGCTA 3240

OQGQGTCCAA TCACTACCAG GAAAAGGGGC TAOGGGGTCC AATCACTACC AGGAAAAGGG 3300

GCTAC6G6CT CCAATCACTA CCAGGAAAAG GGGCTACAGG GTCCAATCAC TACCAGGAAA 3360

AGGQ6CTA03 6QCTGCAATC ACTACCAGGA AAAGGGGCTA CAGGGTCCAA TCACTACCAG 3420

A(aVAAGGQGC TAOSOGCTCC AATCACTACC AOQAAAAOQG OCTAOGGGGT CCAATCACTA 3480

CCAOGAAAAB GGGCTACAGG GTCCAATCAC TACCAGGAAA AGGGGCtAOG G6GTCCAATC 3540

ACTACCAOGA AAAGGGGCTA CGGGCTCCAA TCACTACCAG GAAAAGGGGC TAOGGGGTCC 3600

AATCACTACC AGGAAAAGGG GCTACAGGGT CCAATCACTA CCAGGAAAAG GGGCTACAGG 3660

GTCCAATCAC TACCACAGAA AOGGGCTACQ G6CTCCAATC ACTACCAGGA AAAGGGGCTA 3720

08GGGTCCAA TC31CTACCAO GAAAAGGGGC TAOGGGCTOC AATCACTACC AGGAAAAGA6 3780

GCTATG6GGT CCAATCACTA CCAGGAAAAG GGGCTAC6GG CTCCAATCAC TACCAGGAAA 3840

AGG6GCTATG GGGTOCAATC ACTACCACAG AAAGGGGCTA CGGGGTCCAA OGTCATCCGT 3900

AGAGAGCACO GAGACCCCCT CATCGAGGAG CTCAATCCOG G0GAC6CCTT GGAGCCTGAG 3960

6GCCGGGGCA CAGG6GGTGT GGTGACOGAC TTOGA06GAG ACGGGATGCT GGACCTCATC 4020

TT6TCCCATG GA6AGTCCAT GGCTCAGCOG CTGTCCQTCT TCC6GG6CAA TCAG6GCTTC 4080

AACAACAACT GGCTGOGAGT GGTQCCAGQC ACC06GTTTG GGGCCTTTGC CAGG0GA6CT 4140

AA3GT06TGC TCTACACCAA GAAGAGTGGG GCCCACCTGA GGATCAT06A OGGGGGCTCA 4200

GGCTACCTGT GTGAGAT6GA GCCCGTGGCA CACTTTGGCC TGGGGAAGGA TGAAGCCAGC 4260

AGTGTGGAGG TGAOGTGGCC AGATGGCAAG ATGGTGAGCC GGAACGTGQC CAGOQGGGAG 4320



ATGAACTCAG TGCTGGAQAT CCTCTACCCC OGGGATGACG AC3«3VCTTCA GGACCC3VGCC 4380

CCACTGGAGT GT6GCCAAGG ATTCTCCCAG CAGGAAAATG GCCATTGCAT GGACACCAA7 4440 

GAATGCATCC AGTTGOCATT 0GTGT6CCCT 0GAGACAA6C COGIATGTGT CAACACCTAT 4500

GGAA6CTACA GOTGOCGGAC CAACAAGAAO TGCAST0GG6 GCTAOSAGOC CAAC6AGGAT 4560

G6CACA6CCT G087G66TAC TGA6CTAGGC TCTAGGCATA CAATGAOCTre GAAACCAAG6 4620 

CCCAAAAAGQ AGCTGCAACT TTCCCAAGGC ATCTGCACCC COGTCTGGTC CTrm ' Ct 'T G 4680 

CCGGGTTGCC GGCTGCTCCT CAAAAGAGCT CAGCTCCAGG CTGCTCCCAG CACCCTTCTC 4740 
CAGAAAGCTC CAGGTATTCC AGAAGCCCAA GTCTATGAAC AAGATCAG6A ATAA 

Seq ID NO: 473 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

I I I I I I

MACPG6LPAR CSGNMGLGGP SGSSPASPPH SSSRYNGPNL VLKYDRAQKR LVNIAVDER5 60 

SPYYALRDRQ GKAIGVTACD IDGDGREEZY FLNTimAFSG H5SSAQVPSG LHSNRPVLXP 120 

PPT7PAGLLG LPPLSGRDFS SSLGQASPDS RQGERVFVPC CRG6LRPTKE PEPPLLRPKS 180 

GVATYTDKLP KFRKKRHEDI LSDEVNVARG VASIjFAGRSV ACVDRXGSGR YSIYIAKYAY 240 

GNVGPDAXjIE MDPEASDLSa GILALRDVAA EAGVSKYTEG PSETASPSIG EISGRTEERE 300 

GGDPBEA0E8 HSGDGSTSQL CRLGHHDGQF KEEAAALVEE QREA6AAGVP RGRVRTAIiQT 360 

8KSBLADXNL FGPPCYySVC APSPAHPFPA RQAPQKYPVA PLVTQZMTE6 RLA6KZARSV 420 

PBPRAPQIDP RCKGRHAEP6 LMAEALGAHP ALSTTWPG6 LRSWEESRQK GQAKSRCALR 480 

ELGGPWSQAT QHLPARELYD IiGEPPILORT DGDPGRHRDS PKVTQECHLV ATKPALGGIiE 540 

GPGRVAKREI GRBTGAVGRP LSHPLVPI7FP SCLRPLEAGT VPGAAU>GMP GHWVLDMAKA 600 

LAWNQKEXEE GRIHGDHEPR PRUUCAREAB FPPGSSEEFL LQPPSGLRGS FVLQVGLGIA 660 

SATHGGSHSF LGGR6VSVGP ILSSSASDZF OSSaUSBHWL FB2IRGDGTFV DAAASAERRL 720 

APrVELXyHL CRDFFHSIfCH LAETGPSSSC CPWHARLLQA FBCHHGLSMS FTRT6SRFYS 780 

FLTQGIASSA HRRTIiSLQGS QGAPPCLLAR APCVU5SLIP TAYYIVIMSA -IPESLMTHSY 840 

LSSERVNV6V DDPHQHGBGV ALADFNROGK VDIVYGNWNG PHRLYLQMST HGKVRPRDIA 900 

SPKFSMPSFV RTVITADFDN DQELEIFFNN lAYRSSSANR LFRCSZIAR6 SSSLTAGGRN 960 

GQ6EGLRIRR GGFPGPGGQA KVNTGPLMKK QKGRKDBDMA R6CGNAGQSL AREPASAIAO 1020 

KGRCaiVAQSV PRTQAPQDTK PHYHKKGLQO PITTRKRGYG VQSI*PGKGAT GSNBYQBKGI* 1080 

RGPITTRKRG YGVQSLP6KG ATGSNHYQBK GLQGPITTRK RGYGICSLPG KGATGSNHYH 1140 

RKGLRAPITT RKRGYGVQSL PGKGATGSNH YQEKGLRGPI TTRKRGYGLQ SLPGKGATGS 1200 

NHYQEKGLQG PZTTRKRGYR VQSLPQRGAT GSNBYQBRGIj RGPZTTRXRG YGLQSLPGKB 1260 

AMGSNBYQEK GLRAPITTRK R6YGVQSLPQ KGATQSNVIR REHODPliIEB XiNPGDALEFB 1320 

6RGT6GWTD FZX3DGMLDLZ ZiSHGESKAQP LSVFRGNQGP NNNWLRWPR TRFGAFARGA 1380 

KWLYTKKSG AHLRZIDGGS GYZiCEMEPVA HFGL6RDEAS 8VEVTWPDGK KVSR19VASGE 1440 

MNSVIiEILYP RDEDTLQDPA PIiECGQGPSQ QENGHCMDTN ECIQFPFVCP RDKPVCVNTY 1500 

6SYRCRTMKK CSRGYEPNED GTACVGTELG SRHTMTWKPR PXKELQLSQG ZCTPVWSFFIi 1560 
P6CRLUJCRA QLQAAPSTLL QXAFGZPEAQ VYBQDQE 

Seq ID NO: 474 DNA sequence

Nucleic Acid Accession #: NM_003661.1

Coding sequence: 1..1152

1 11 21 31 41 51

I I I I I I

ATGAGTGCAC TTTTCCTTGG TGTGGGAGTG AGGGCAGAGG AAGCTGGAGC GAGGGTGCAA 60 

CAAAACGTTC CAAGTGGGAC AGATACTGGA GATCCTCAAA GTAA6CCGCT OGGTGACTGG 120 

GCTGCTG G CA CCATG6ACCC AGAGA6CAGT ATCTTTATT6 AQGATGGCAT TAAGZATTTC 180 

AA6GAAAAAG TGAGCACACA GAATCTGCTA CTCCTGCTGA CTGATAATQA OGCCTGGAAC 240 

GGATTCGTGG CTGCTGCTQA ACTGCCCAGG AATGAGGCAO ATGAGCTCOO TAAAGCTCTG 300 

GACAACCTTG CAAQACAAAT GATCATQAAA GACAAAAACT GGCACGATAA AGGCCAGCAG 360 

TACAGAAACT GGTTTCTGAA AGAGTTTCCT OGGTTGAAAA GT6AGCTTGA GGATAACATA 420 

AGAAGGCTCC GTCCXXTTTGC AGATGGG6TT CAGAAGGTOC ACAAAGGCAC CACCATOGCC 480 

AATGTGGTGT CTGGCTCTCT CAGCATTTCC TCTGGCATCC TGACCCTOGT CG6CATGGGT 540 

CTGGCACCCT TCACAGAGGG AGGCA6CCTT GTACTCTTGG AACCT6GGAT GGAGTT6GGA 600 

ATCACAGC06 CTTTGACOQG GATTACCAGC AGTACCATGQ ACTACX^GAAA 6AAGTGGTGG 660 

ACACAAGCCC AA6CCCACGA CCTGOTCATC AAAAGCCTTO ACAAATTGAA GGA6GTGAGG 720 

GAGrrrrreo gtqagaacat atccaacttt ctttccttaq ctggcaatac ttaccaactc 780 

ACACGAGGCA TTGGGAAGGA CATCOSTGCC CTCAGACGAQ CCAGAGCCAA TCTTCAGTCA 840 

GTACOGCATG CCTCAGCCTC AOGCCCCCGG GTCACTGAGC CAATCTCAOC TGAAABCGGT 900 

GAACAGGTGG AGAGGOTTAA TGAACCCAGC ATCCTGGAAA 76A6CAGAGG AGTCAAGCTC 960 

ACGGATGTGG CCCCTOTAAG CTTCTTTCTT GT6CTGGATG TAGTCTACCT CGTGTACGAA 1020 

TCAAAGCACT TACATGAOGG GGCAAAGTCA GAGACAGCTG AG6A6CTGAA GAAGGTGGCT 1080 

CAGGAGCtGG AGGAGAAGCT AAACATTCTC AACAATAATT ATAAGATTCT GCAG006GAC 1140 
CAAGAACTGT GA 

Seq ID NO: 475 Protein sequence
Protein Accession #: NP_003652.1

1 11 21 31 41 51 

I I I I I I 

MSAIjFLGVGV RAEBA8ARVQ QNVPSGTDTG DPQSKPX/SDW AAGTMDPBSS ZPZEDAZKYF 60 

KEKVSTQHLL LLL'mTEAMK GEVAAAELPR NEADELRKAL DKLAROMIMK DKKV7HDXGQQ 120 

YRZ2WFLKBFP RLKSELaDNX RRLRALAXX^^ QKVHKGTTZA HWSGSLSZS SGZLTLV04G 180 

IAPFTEG6SL VLLBPGMEZXS ZTAALT6ZT6 STMDYGKKWH TQAQAHnLVI KSLDKLKBVR 240 

BFLGESnSMP 1.SLAG13TYQL TRGIGKDZRA USRARAMLQS VPBASASRFR VTEPZSABSO 300 

BQVERVHSPS ZIiENSRGVKL 1I3VAPVSFFL VlfWYLVYE SKBZAEQAXS ETABELKBVA 360 
QELBEKXmZ. NZINYKILOAD QBL 

Seq ID NO: 476 DNA sequence

Nucleic Acid Accession #: NM_014452.1

Coding sequence: 1..1968

1 11 21 31 41 51 
ATGQGGACCT CTCCGAGCAG CAGCAC0C5CC CTOGCCTCCT GCAGCOGCAT CGCCCS3C0GA 60

GCCACAGCCA CGATGATOGC GGGCTCCCTT CTCCTGCTTG GATTCCTTAG CACCACCACA 120 

GCTCAGCCAG AACAGAAGGC CTOGAATCTC ATTGGCACAT ACXX3CCATGT TGACCGTGCC 180

ACOSGCCAGG TGCTAACCTa TGACAAGPCST CCAGCAGGAA OCTATGTCTC TGAGCATTGT 240

AOCAACACAA GCXTTGCGCGT CT6CA6CAGT TGCCCTGTGG G6ACCTTTAC CAGGCATGAC 300 

AATGGCATAO AGAAATGCCA TGACTGTAST CAGCCAT6CC CATG6CCAAT 6ATTGAGAAA 360

TTACCTTGTG CTGCCTTGAC TGACCGAGAA TGCACTTGCC CACCTGGCAT GTTCCAGTCT 420 

AAOGCTACCT GTGCCaX» TAOGGTCTGT CCTGTGGGTT GGGGTGTGCX3 GAAGAAAGGG 480 

ACAGAGACTG AGGATGTGOG GTGTAAGCAO TGTGCTOGGQ GTACCTTCTC AGATGTGCCT 540 

TCTAGTGTGA TGAAATGCAA AGCATACACA OACTGrCraA OTCAGAACCT GGTQGTGATC 600 

AAOCCQGGQA GCAAGGAOAC ACACAA06TC TGT6GCACAC TCCC G l'CCTT CTCCAfiCTCC 660 

ACCTCACCTT OCCCTGGCAC AGCCATCTTT CXAC6CCCTG AGCACATGGA AACCCATGAA 720 

GTCCCTTCCT CCACTTATGT TCCCAAAGGC ATGAACTCAA C3\GAATOC3U^ CTCTTCTGCC 780 

TCTGTTAGAC CAAAGGTACT GAGTAGCATC CAGGAAGGGA CAGTCCCTGA CAACACAAOC 840

TCAGCAAGGG GQAAGGAAOA OGTGAACAAO ACCCTCCCAA ACCTTCAGGT AGICAACCAC 900 

CAGCAAGGCC CCCACCACAG ACACATCCIG AAGCTGCT6C OGTCCATGGA 66CCACTGG6 960 

GGOGAGAAGT OCAGCACGCC CATCAAGGGC CCCAAGAGGG GACATCCTAG ACAGAACCTA 1020 

CACAAGCATT TTGACATCAA TGAGCATTTG CCCTGGATGA TTGTGCTTTT CCTGCTGCTG 1080 

G'lXXTfGltjG TGATTGT66T GTGCAGTATC OGGAAAAGCT OQAGGACTCT GAAAAAGGGO 1140 

CCCOGGCAGO ATGOCAGTQC CATTGTGQAA AAGGCAGGOC TGAAGAAK7C CATGACTCCA 1200 

ACCCASAACC GGGAGAAATG GATCTACTAC TGCAATGGCC ATG6TAT0GA TATCCTGAA6 1260 

CTTGTAGCAG CCCAAGTGGG AAGCCAGTOO AAAGATATCT ATCAGTTTCT TTGCAAT6CC 1320 

AGTGAGAGGG AGGTTGCTGC TTTCTCX3U^T GGGTACACAG CX33ACXACGA GCGGGCCTAC 1380 

GCAGCTCTGC AGCACT6GAC CATCC3GGG6C CCCGAGGCCA GCCT06CCCA GCTAATTA6C 1440 

GCOCT G OGOC A6CA00G6AG AAAOGATGTT GrTGGAGAAGA TTQGTG6GGT GATQ6AAGAC 1500 

ACCACCCA6C TGGAAACTGA CAAACTAGCT CTCOOGATGA GCCCCAGCXX GCTTAGCCC6 1560 

AGCCCCATCC CCAGCCCC3VA CGCGAAACTT GAGAATTCCG CTCTCCTGAC GGTGGAGCCT 1620 

TCCCCACAGQ ACAAGAACAA GGGCTTCTTC GTGGATGACST CGGAGCCCXTT TCTCCX5CTGT 1680 

GACTCTACAT CCAGCGGCTC CTCCGC6CTG AGCAGGAAOQ 6TTCCTTTAT TACCAA AGAA 1740 

AAGAAGGACA CAGTGTTGOG GCA06TA0SC CTGGACX^CCT GTGACTTGCA GCCTATCTTT 1800 

GATGACAT6C TCCACTTTCT AAATCCTGAG 6AGCTG0G6G TGATTGAAGA QATTCCCCAG 1860 

6CTGAGGACA AACTAQACCXS 0CTATT03AA ATTATTOGAG TCAAGAfiCCA 66AAGCCAGC 1920 
CAGAOCCTCC TG6ACTCTGT TTATAGOCAT CTTCCT6ACC TGCIGTAG 

Seq ID NO: 477 Protein sequence
Protein Accession #: NP_055267.1

1 11 21 31 41 51

I I I I I I

MGT5PSSSTA lASCSRXARR ATATKZAGSL LXiLGFLSTTT AQPBQKASKL ZGTYRHVDRA 60 

TGQVLTCDKC PAQTYVSEHC TNTSLRVCSS CPVGTPTKHB KOIEKCHDCS QPCFWPKIEK 120 

LPCAALTDRE CTCPP6MFQS HATCAPHTVC FVGW6VRKK0 TETEDVRCKQ CARGTFSDVP 180

SSVMKCKAYT DCLSQNI4WI KPGTKETDNV CXSTLPSFSSS TSPSPGTAIP PRPEHMETRE 240 

VPSSTYVPKO MKSTESNSSA SV&PKVXjSSI QEGTVPDHTS SARGKEDVKK TLPMIX3WIIH 300 

QQGPHBHHIL KLLPSMEATG GEKSSTPIK6 PKHGHPRQNL HKRFDINEHL PHMIVLFLLL 360 

VLWIWCSZ RXSSRTLKKG PRQDPSAIVE KAGLKKSMTP TQNREKWIYY OKSIGZDZLK 420 

LVAAQVGSQW KDIYQFLQ7A SEREVAAFSN GYTADHERAY AALQHWTIR6 PEASIiAQI'IS 480

ALRQHRRNDV VEKIRGLMED TTQLETDKLA ItPMSPSPLSP SPIPSPNAKL ENSALLTVEP 540 

SPQDKNX6FF VDBSBPLLRC OSTSSGSSAL SRKGSFZTKB KKDTVLRQVR WPCDLQPIF 600 
DDMLHFU7PS ELRVZEEZPQ ASDXLDRLFB ZI6VKSQBAS QTLLDSVySB LPOLL 

Seq ID NO: 478 DNA sequence
Nucleic Acid Accession #: XM_044533
Coding sequence: 238..2751

1 11 21 31 41 51

I I I I I I

GCTCTGCCCA AGCCGAGGCT 60GGG6CGGG CGC0G60GGG AGGACT606G T6CCC060GG 60 

AGGG6CTCAG TTTGCCAGGG CCCACTTGAC CCTGTTTCCC ACCTCCCGCC CCCCAOGTCC 120 

GGA6G0GGGG GCCCCOQGGO O6ACT06GGG GO^CCGCG OGC3C6GAGCT GCCGCCCGTG 180

AGTCOOGCCG AGCCACCTGA GCXXGAOCOG OQGGACACOO TCGCTCCTGC TCTCOGAATG 240 

CTGCGCACOG 06AIGG6GCT GAGGAGCTOG CT06C06CCC CATGGGGOGC GCTGCOSCCT 300 

C66CCACGGC TGCTGCTGCT CCTGCTGCTG CTGCTCCTGC T6CASC0GCC GCCTCOGAOC 360 

TGGGCOCTCA GCCCCCGGAT CAOCCTGCCT CTGGGCTCTG AAQAGOGOCC ATTCCTCAGA 420 

TTCGAAOCTG AACACATCTC CAACTACACA GCCCTTCTOC TGAQCAGGGA TOGCAGGACC 480 

CTGTAOSTGG GXGCTCQAGA GGCCCTCTTT GCACTCAGTA 6CAACCTCAG CTTCCTGCCA 540 

OGGGG0GA6T ACCAGOAGCT GCTITGGGGT 6CAGA0GCAG AGAAGAAACA GCAGTGCA8C 600 

TTCAAG6GCA AGGACCCACA GCGCGACTGT CAAAACXACA TCAASATCCT CCT G COGCTC 660 

AGCGGCAGTC ACCTGTTCAC CTGTGOCACA GCAGCCTTCA GCCCCATOTG TACCTACATC 720 

AACATGQAGA ACTTCACCCT 6GCAAGG6AC GAQAAGGGGA ATGTCCTCCT GGAAGATGGC 780 

AAGGGCOGTT GTCCCTTC6A CC06AATTTC AAGTCCACTG CCCTGGTGGT TGATG60GAG 840 

CTCTAC31CTG GAACASTCAO CAOCTTCCAA GG6AATGACC C GG OC A TCTC GCGGAGCCAA 900 

AGGCTTOGOC CCACCAAGAC OGAGAGCTCC CTCAACTGGC TGCAAGAOCC AGCTTTTGTG 960 

GCCTCAGCCT ACATTCCTGA GAGCCTGGGC AGCTTGCAAG GOQATGATQA CAAGATCTAC 1020 

TTTTTCTTCA GCQABACTGG CCAQQAATTT GAOTTCTTTG AGAACACCAT TGTGTCCOBC 1080 

ATTGCCOGCA TCTGCAAGGG CGATQAGGGT GGAGA6CG6G TGCTACAGCA G06CT6GAGC 1140 

TCCTTOCTCA AQGCCCAGCT GCTOTGCTCA C6G0CCQA06 ATGGCTTCOC C TTCAAOS TQ 1200 

CTGCAGGATO TCTTCAC6CT 6A6CC0CA6C CC0CA66ACT G60G1X3ACAC OCTTTTCTAT 1260 

GGGGTCTTCA CTTCCCAGTG GCACAGGGGA ACTACAGAAG 6CTCTGCCGT CTGTGTCTPC 1320 

ACAATGAAGG ATGTGCAGAG AGTCTTCAGC GGCCTCTACA AGGAGGTGAA CCGTGAGACA 1380 

GASaurTGGT ACAC06TGAC CCAOCC6GT0 CCCACACGCC GOCCTGGAGC GT6CATCACC 1440 

AACAGTGCCC 6GQAAAGGAA QATCAACTCA TCCCTGCASC TCOCAQACOO OQTGCTGftAC 1500 

TT0CTCAA06 AOCACTTOCT 6AT6GA0660 CAQGTCGQAA GCC6CAT6CT GCT6CTGCAG 1560 

CCCCAGGCTC GCTACCAGCG OGTGGCTGTA CACCGCGTCC CTGGCCTGCA CCACACCTAC 1620 

GATCTCCTCT TCCTGGGCAC TGGTGAOGGC OGGCTCCACA AGGCAGTGAG COTGGGCCCC 1680 

CQGGTGCACA TCATTGAGGA GCTGCAGATC TTCTCATGQG GACAGCCOGT GCAGAATCIG 1740 
CTOCTG GA CA CCCACACGGG GCT6CTGTAT 60G6CCTCAC ACTOGGGOGT A6TCCAG6TG 1800

CCCATGGCCA ACTGCAGOCT GTACAGGAGC TGTGGGGACT GCCTCCTCGC CCGGGACCCC 1860

TACTOTGCTT GGAGOGGCTC CAGCTGCAAG CACGTCA6CC TCTACCAGCC TCAGCTGGCX! 1920 

ACCAOGCCGT GGATCC3«X5A CATOGAGGGA GCCAGCGCCA AGGACCTTTG CAGOGCXTTCT 1980 

TCX3GTTGTGT CCCCGTCTTT TGTACCAACA GGGQAGAAGC CAT6TGAGCA AGTCCAGTTC 2040 

CAGCCCAACA CAGTGAACAC TTrGGCCTGC OCOCTCCTCT CCAACCtGQC GftCCOGACTC 2100 

TGGCTAOCCA ACGGGGCCCC OQTCAATGCX: TOGGCCTCCT GCCACX3TGCT AOCCACTOGQ 2160 

GACCTGCTGC TGGTGGGCAC CCAACAGCTG GGGGAGTTCC AGTXXTPGGTC ACTAGAGGAQ 2220 

GGCTTCOVGC A6CPGC5TAGC CAGCTACTGC CCAGAGGTGG TGGAC3GA0GG GGTGGCAGAC 2280 

CAAACAGATG AGGGTGGCAG TGTACCOGTC ATTATCASCA CAT06CGTGT GAGTCCA CCA 2340 

GCTGGTGGCA AGGCCAGCTG GGGTGCAGAC AGGTCCTACT GGAAGGAGTT OCTOGTGATQ 2400 

TGCACGCTCT TTGT G CTGGC OOTGCTQCTC CCAGTTTTAT TCTTGCTCTA C06GCAC0GG 2460 

AACAGCATGA AAOTCTTCCT GAAGCASGGG GAATGTGCCA GC3GTGCACCC CAAGACCTGC 2520 

CCTGTQgT G C TGCCCCCTGA GACCOGCOCA CTCAAOGQOC TAGGGCCCOC TAGCACCCCX3 2580 

CTOGATCACC GAGGGTACCA GTCCCTGTCA GACAGCCCCC CGGGGTCC06 AGTCTTCA CT 2640 

QASTCAGAGA AGAGGCCACT CAGCATCCAA GACAGCTTCG TGGAQGTATC CCCWSTGTCC 2700

CCCOGGCCCC GGGTCOSCCT TGGCTCQGAG ATCCGTGACT CTGTGGTGTG AGASCTGACT 2760 

TCCAGAGGAC GCTCCCCTGO CTTCAGGGGC TGTGAATGCT C3GGAGAGGQT CRACT GGAC C 2820 

TOXCTCOGC TCrOCTCTTC GTG6AACAC0 ACCGTGGTGC CC3QGCCCTTG GGAGCCTTGG 2880 

GGCCAGCTGQ OCTGCIGCTC TCCAISTCAAG TAGCGAAGCT CCTACCACCC AGACACCCAA 2940 

ACAGCCX3TGG CXXX^WSAGGT CCTGGCCAAA TATGGGOGCC TGCCTAGGTT GGTGGAACAG 3000 

TOCTCCTTAT GTAAACTGAO OXTri V m AAAAAACAAT TCX3AATGTC AAACTAGAAT 3060 

GAGAGG6AA6 AGATAGCATG GCATGCAGCA CACA06GCTG CTCOVGTTCA TGGCCTCCCA 3120 

GGG6TGCXG0 GGAT6CATCC AAA0T0GTT6 TCTGA6ACAG AGTi GG AAAC CCTCACCAAC 3180 

TX5GCCTCTTC ACCTTCCACA TTATCCOGCT GCCACOGGCT GCCCTGTCTC ACTGCAGATT 3240 

CAQGACCAGC TTOGGCTGCO TGOGTTCTGC CTTGCCAGTC AGCOGAGGAT GTAGTTGTTG 3300 

CTOC06T0GT CCCACCACCT CAGGGACCAG AGGGCTAGGT TGGCACTGCG GCCCTCACCA 3360 

QGTCCXGGGC TCGGACOCAA CTOCTGOACC TTTCCAGCCT GTAtCRGGCT GTG6CCACAC 3420 

GAGAGGACA8 CGOGftGCTCA GGAGACATTT OGTGACAATG TAOOCCTTTC CCTCAGAATT 3480 

CAGGGAAGAG ACTGTOGCCT GCCTTCCTCC GTTGTTGC3GT GAGAACCOGT GTGCCCCTTC 3540 

CCACCATATC CACCCTOGCT CCATCTTTGA ACTCAAACAC GAGGAACTAA CTGCACCCTG 3600 

GTCCTCTCOC CAGTCOCXAG TTCACCCTCC ATCCCTCAOC TTOCTCCACT CTAAG GGATA 3660 

TCAACACTGC CCAGCACS^ GGOOCTG A AT TTATOTOOTT TTTATAC ATT TTTTAATAAG 3720 
ATGCACTTTA TGTGATTTTT TAATAAA3TC TQAASAATTA CTGITT 

Seq ID NO: 479 Protein sequence
Protein Accession #: XP_044533.3

1 11 21 31 41 51

I I I I I I

MLRTAMdiRS WIAAPWGALP PRPPLLLLLL LIiLLLQPPPP TWALSPRISL PLGSEERPPL 60 

RFEAmiSNY TALLLSRDGR TLYVGAREAL FALSSNLSFL PGGEYQELLW GADAEKKQQC 120 

SFIOBXDPQBD OONYXXIIiLP LSGSKLFTOG TAAFSPMCTY ZimESIFTZAR DEKGNVLIiED 180 

GKGRCPPDPH PKSTALWDG ELYTGTVSSP QGNDPAISRS QSLaPTKTBS SIHWLQDPAF 240 

VASAYIPBSL GSLQGDDDKI YPFPSBTGQE FEFFBNTIVS RIARICKGDE GGERVI^RW 300 

TSPliKAQIiLC SRPDDGFPFN VLQDVFTLSP SPQDWRDTIiP YGVFTSQWHR GTTBGSAVCV 360 

FTMKDVQRVF SGLYKEVNRE TQQWYTVTHP VPTPRPGACI TNSARE RKIN SSLQLPDRVL 420 

HFLKDHFLMD GQVRSRMLXjL QFOARYORVA VBBVPGLHHT YUVJiFUffSGD 6RLBKAV5V6 480 

PRVHIIEELQ IPSSGQPVQH LLMTHRGLI* YAASKSOWQ VPMANCSLYR 8CGDCLIARD 540 

PYCAWSGSSC KHVSLYQPQI. ATRPWIQDIE GASAKDLCSA SSWSPSFVP TGEKPCEQVQ 600 

PQPNTVNTIA CPLLSNLATR LWLRNGAPVN ASASCHVLPT GDLLLVGTQQ LGEFQCWSLE 660 

EGFQQLVASY CPBWEDGVA DQTDEGGSVP VIISTSRVSA PAGGKASWGA DRSYWKBFLV 720 

MCTLFVLAVL LPVLPLLYRH SNSMRVFLRQ GECASVHPKT CFWLPPETR PLNOLGPPST 780 
PLDHRGYQSL SDSPPGSRVP TESBKRPLSI QDSFVBV8PV CPRFRVRLQS BIRDSW 

Seq ID NO: 480 DNA sequence

Nucleic Acid Accession #: NM_004217.1

Coding sequence: 58..1092

1 11 21 31 41 51 

GGC06GGAGA GTAGCAGTGC CTTGGACCCC AGCTCTCCTC OCOCrTTCTC TC TAAQGATG 60 

GCCCAGAAGG AGAACTCCTA CCOCTGGCCC TAOOGCOGAC AGAOGGCTCC ATCTGQOCTO 120 

AGCACCCTOC CCCAGCGAGT CCTCOGGAAA GAGCCTGTCA CCCCATCTOC ACTTGTCCTC 180 

ATCAOCOGCT CCAATGTCCA GCXXSVCAGCT GCCCCTGOCX: AGAA SGTGA T GGA6AATAGC 240 

AGTGG6ACAC CCGACATCTT AA060GGCAC TTCACAATTG ATGACTTTGA GATTGGGOGT 300 

CCTCTGG6CA AAGGCAAGTT TGGAAAOGTO TACTTGGCTC GGGAGAAGAA AAGCCATTTC 360 

ATOGTGGOSC TTAAGGTCCT CTTCAAGTCC CAGATAGAGA A06AGGG06T G GROm TCAO 420 

CTGOGCAGAG AGATOGAAAT CCAGGCCCAC CTGCACCATC OCAACATCCT GCGTCTCTAC 480 

AACTATTTTT ATGACCGGAG GAGGATCTAC TTGATTCTAG AGTATQCCCC COGCGGGGAG 540 

CTCTACAAGG AGCTGCAGAA GAGCTGCACA TTTGAOGAGC AGOGAACAGC CACGATCATG 600 

GAGGA6TT6G CAGATGCTCT AATGTACTGC CATGOGAAGA AGGTGATTCA CAGAGACATA 660 

AAGCCAGAAA ATCTGCTCTT AGGGCTCAAG GGAGAGCTGA AGATTGCTSA CTTOOTCTGG 720 

TCTGTGCATG CGCCCTCCCT GAGGAGGAAG ACAATGTGTG GCACCCTGGA CTACCTGCCC 780 

CCASUSATGA TTGAGGCGOG CATGCACAAT GAGAAGGTGG ATCTGTQGTG CATT6GAGTG 840 

CTTTGCTATG AGCTGCTGGT GGGGAACCCA CCCTTTGAGA GTGCATCACA CAAOGAGACC 900 

TATOGOOGCA TOGTCAAGGT GGACCTAAAG TTCCCCGCTT CTGTGCCCAC GGGAGCCCAG 960 

GACCTCATCT CCAAACTOCT CAGGCATAAC OCCTCGGAAC GGCTOCOCCT QGOOCAOQTC 1020 

TCAGCCCACC CTT0GGTCCX3 GGCCAACTCT OGGAGGGTGC TGCCTCCCTC TOOOCTTCAA 1080 

TCTGT06CCT GATOgTC CC T GTCATTCACT GGGGTGOSTG TGTTTGTATG li.i,^3 T^TATG 1140 

TATAOGGGAA AGAAGGGATC CCTAACTGTT OCCTTATCTO TTTICIAOCT CCTCCTTTGT 1200 
TTAATAAAGG CPGAAGCTTT TTGT 

Seq ID NO: 481 Protein sequence
Protein Accession #: NP_004208

1 11 21 31 41 51
KAQKEHSYPW PYGRQTAPSG LSTLPQRVLR RBFVTPSALV LMSRSNVQPT AAFGQEVMBH 60

SSGTPDILTR HFTXDDFEIG RPLGKGKFGN VYUXREKKSH FZVALKVI.FK SQIEKEGV^ 120 

QIiRREIEIQA HLHHPNILRL VNYFYDRRHI YLILEYAPRG ELYKELQKSC TFDEQRTATI 180

NEELADALMY CB6KKVIBRD ZKPEKLLLGIi KGBLKIADFG WSVHAPSLRR KTKCGTLDYL 240 

PPQ42EC3MB HBKVDLHCI6 VLCVBZiLVGH PPFBSASHHS TYRRIVKVDL KFPA8VPTGA 300 
QDLISKLLRH NPSERLPIAQ VSARPHVHAN SRRVLPPSAL QSVA 

Seq ID NO: 482 DNA sequence
Nucleic Acid Accession #: AK055663
Coding sequence: 38..1423

1 11 21 31 41 51

I I I I I I

AGAA060CTT CaSGCGGGAG CTGT6CAGCT CCTTATCATG G6GACAATTC ATCTCTTTQ6 60 

AAAACCACAA AGATCCTTTT TTGGCAAGTT GTTACGGGAA TTTAGACTT6 TAGCA6CTGA 120 

CCGAAGGTCC TGGAA6ATAC T6CTCTTTG6 TGTAATAAAC TTGATATGTA CTGGCTTCCT 180 

GCTTATGTX3G TGCAGTTCTA CTAATAGTAT AGCTTTAACT GCXTATACTT ACCTGACCAT 240 

TTTTGATCTT TTTAGTTTAA TGACATGTTT AATAAGTTAC TGGGTAACAT TGAGGAAACC 300 

TAGCCCTGTC TATTCATTTG GGTTTGAAAG ATTAGAAGTC CTGGCTGTAT TTGCCTCCAC 360 

AGTCTTC5GCA CAGTTGGGAG CTCTCTTTAT ATTAAAAGAA AGTGCAGAAC GCT TTTT QGA 420 

ACAGCCCGAG ATACACACGQ GAAGATTATT AGTTGGTACT TTTGTGGCTC TTT6TTTCAA 480 

CCTGTTCACX3 ATGCTTTCTA TTCGGAATAA ACCTTTTGCT TATGTCTCAG AAGCTGCTAG 540 

TACGAGCTGG CTTCAAGAGC ATGTTGCAGA TCTTAGTOGA AGCTTGTGTG GAATTATTCX: 600 

GGGACTTAGC AGTATCTTCC TTCCCCGAAT GAATCCATTT GTTTTGATTG ATCTTGCTGG 660 

AGCATTTGCr CTTTGTATTA CATATATGCT CATTGAAATT AATAATTATT TTGCCGTAGA 720 

CACTGCCrCr GCTATAGCTA TTGCCTTGAT GACATTTGGC ACTATGTATC CCATGAGTGT 780 

GTACAGTGGO AAAGTCTTAC TOCaGACAAC ACCACCCCAT GTTATTGGTC AGTTGGACAA 840

ACTCATCAGA GAGGTATCTA OCTTAGATGG AGTTTTAGAA GTCOGAAATG AACATTTTTG 900 

GACCCTAGGT TTTGGCTCAT TGGCTGGATC AGTGCATGTA AGAATTCGAC GAGATGOCAA 960 

TGAACAAATG GTTCTTGCTC ATGTGACCAA CAGGCTGTAC ACTCTAGTGT CTACTCTAAC 1020 

TGTTCAAATT TTCAAGGATO ACTGGATTAG GCCTGCCTTA TTGTCTGGGC CTGTTGCAGC 1080 

CRATOTOCT A AACTTTTCAG ATCATCAC3GT AATOCCAATC OCTCTTTTAA AGGSTACTGA 1140 

TGATTTGAAC CCAGTTACAT CAACTCCAGC TAAACCTA(5T AGTCCACCTC CAGAATTTTC 1200 

ATTTAACACT CCTGGGAAAA ATGTGAACCC AGTTATTCTT CTAAACACAC AAACAAGGCC 1260 

TTATGGTTTT GGTCTCAATC ATGGACACAC ACCTTACAGC AGCAT6CTTA ATCAAGGACT 1320 

TGGAGTTCCA GGAATTGGAO CAAC7CAAGG ATTGAGGACT GGTTTTACAA ATATACCAAG 1380 

XAGATATGGA ACTAATAATA GAATT66ACA AOCAAGACCA TGATAGACTC TAACTTATTT 1440 

TTATAAG6AA TATTGACTCC TTG6CITCCA ATTTATTTAG TAATCCAACT TTGCATTGAC 1500 

TGTTTAATCA TTTACTCTAA ATGTTAGATA ATAGTAGTCT TGTTCACATT TCATGAAACC 1560 

TATGAAACTA TATTTTTGTA AAATGTATrT GTGACAGTGA AATCCTCGTA AATGTTAAAG 1620 

GCTTTAAATA GGCTTTCTTT AGAAAATGTG TTTCTTTAAA TTTGGATTTT GGTATCTT7G 1680 

GTTTTGTAGT TGACTGCAGT 0T6ATGTQAC CTTACCTTTA TAA6AGCCAC TTQATG6AGT 1740 

AGATCTGTCA CATTACTAAG ATAOGATATT TCTTTTTTTT TC0GAGAC3GG AGTCTTOCTC 1800 

TGCCACTGTG CCCGGCCAAT ACATTATTAT TAACTTAAGG CTGTACTTTA TTAAGGCTTC 1860 

CTTAGTTTTT GTTTTGTTTT GTrTTTTGAG ATGGAGTCTC ACTCTGTCGC CCAGGCTGGA 1920 

ATGCAGTGGC ATGATCTCAG CTCACTGCSVA CCTCTGCCTC CTGAOTTCAA ATGATTCTCC 1980 

TGOCTCAGCC TGOOGAGTAG CTGGGATTAC AGGCACCTGC CACCAGGCCX: AGCI7ATTTT 2040 

TGTATTTTTA OTAAAGACOG G6GATTTCAC CATGTTGGCC AG6CT0GTCT TGAACTCCTG 2100

ACCTCATGAT CCACCXy^CCT TAGCCTCCCA AAGTGCTGGG ATTAGGTGTG AGCCACCGCA 2160 

CCTGGCCGAT ATTTPCTTTA ATGAAATTTA TAAATATGCT TCTTGAATAA TACACATTTT 2220 

GGGAAAGGGA AAAATGTCTG TTCAAAAAGT AAAGGTCTCT TTTATAGCTT TTCCAAACTT 2280 

AATTGCTAAA ITm ' Cmtj AGGTTCTCCT GAATZATGTC TTACAAACTA AAAfiCAAAAA 2340 

TTTTTAGCAO AAATTTTGGA ATACATTCTA TCTAGCACAA TTT6AATTTT TAATTATCAA 2400 
GATTTTTGTT AAAGTTTCTC TCCTTTAAAA ATTTTAGTAC ATTTGTAAAT 

Seq ID NO: 483 Protein sequence
Protein Accession #: BAB70980.1

1 11 21 31 41 51

I I I I I I

MGTZHLFSKP QRSFFGKLZiR EFRLVAADRR SWKILLFGVZ NLZCTQFLLM WCSSTNSZAL 60 

TAYTyLTIFD LFSLNTCLZS YWVTLRKPSP VYSFGFBRLB VLAVFASTVI* AQLGALFILK 120 

ESAERFZ.EQP EZBT6RLLVG TFVAIiCEKLF TMLSIRNKPF AYVSEAASTS HLQEHVADLS 180 

RSLCGZZPGL SSIFLPRMNP FVLZDLAGAF ALCXTYMLtE ZNNYFAVDTA SAIAXALMTP 240 

GTMYPMSVYS GKVLLQTTPP HVXGQUDKLI REVSTLDGVL EVRNBHPWTL GPGSLAGSVH 300 

VRIRRDANBQ MVLABVTNRL YTLVSTLTVQ XFKDDWZRPA LLSGPVAANV ZitJFSDHHVZP 360 

MPLLXGTDDl. NFVTSTPAKP SSPPPEFSFN TPGKNVNPVI LLNTOTRPYG FGLHBGaTPY 420 
SSMLNQGIiGV PGIGATQ6LR TOFINIPSRY GTXSmilGQPR P 

Seq ID NO: 484 DNA sequence

Nucleic Acid Accession #: FGENESH predicted

Coding sequence: 1..900

1 11 21 31 41 51

I I I I I I

ATGCOGOOGC GGGA6CTGAG CGAOGCCGAG CC6CCXXCGC TCXZGGGCCCC GACCCCTCCC 60 

GCGG6GC0GC GTAGC60GCC CCXAGAGCTG 6GCATCAAGT G06T6CTGGT GGG0GA06GC 120 

GCCGTGGGCA AGAGCAGCCT CATCGTCA6C TACACCTGCA ATSG6TACCC 060S0GCTAC 180 

CGGCCCACTG CGCTGGACAC Cn'CTCT GG T ACGTACGTTC AATCGCCCGT G0GGC0GC6T 240 

GGCTGCGGCG GGGCTGTGCA COGGGOAGCT GGGGCGGGCG TCTCGG0GG6 AGG6CGCAGA 300 

GGACCC06GG GAGGAGACTG GAOCAGGCCC CX^GGTrGGOG CTGGTGOQGC CCAGGA06CT 360 

CTTCCTAACT CAGGCTCTGC 00G00CC6GC CCTGCAGITQC AAGTCCIG6T GGATGGAGCT 420 

C06GTG06CA TTGAGCTCTG GGACACAGGG GGACAGGAGG ATTTTGACCSG ACTTOGTTCX: 480 

CTTTGCTACC OGQATACOGA TGTCTTCCTG GCGTGCTTCA GCGT6GTGCA GCCCAGCTCC 540 

TTTCAAAACA TCACAGAGAA ATGGCTGCCC GAGATCCGCA CGCACAACCC CCAGGOOCCT 600 

GTGCTGCTGG TG6GCACCCA GGCOGACCIG AGGGAOGATG TCAAC8TACT AATTCA6CTG 660 
GAOCAGGGGG G00GG6AG6G COOCXST GC CC CAAGCCCAG6 CTCAGQGTCT OQCGQAOAAO 720

AT0C3GAGCCT GCP G CT AO CT TC5W3TGCTCA GOCTTGAOGC AGAAOAACTT GAAOQAASTA 780 

TTTGACTGGG CTATTCTCAS TGCCATTGAG CACAAAfiCCC GGCTGGftGAA GAAA CTGRA T 840 
GOCAAAGGTG TQOGCACCCT CTCCOGCTGC 0GCTGGAA6A ASrrCTTCTG CTTOGTTTGA 

Seq ID NO: 485 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

MPPRELSEAB PPFIiRAPTPP FRRSSAPPEL GIXCVLV(3)6 AVGXS5LIVS YTOiaYPARy 60 

RPTALDTFSG TYVQSPVRPR GCGGAVERGA GA6VSA06RR GPRCSGDWSRP RGGAGAAQDA 120

IiPNSGSPRPA PAVQVLVDGA PVRIELWDTA GQEDFDRLRS LCYPDTDVPL ACPSWQPSS 180 

PQNITEiWLP EIRTHNPQAP VLLVGTQADL RDDVNVLIQIi DQGGREGPVP QPQAQGLABK 240
IRACCYLECS ALTQKNLKBV FDSAILSAIE HKARLEKKLN AKGVRTLSRC RWKKFFCFV 

Seq ID NO: 486 DNA sequence

Nucleic Acid Accession #: XM_063833.2

Coding sequence: 1..711 

1 11 21 31 41 51 

I I I I I I 

ATGCCGC06C GGGAGCTGAG CXU^CCXSAG OC6CCCC0GC TCCGG6CCCC GACCCCTGCC 60 

C0GC66CGGC GTAGOGCGCC CCCAGAOCTG GOCATCRAGT GCGTGCTGGT GGGOGA0C3GC 120 

GCXS3TGGGCA AGAGCAGCCT CATCGTCAGC TACACCTGCA ATGGGTACCC CGOGOGCTAC 180 

CGGCCCACTG OGCTGGACAC CTTCTCTGTG CAAGTCCTGG TGGATGGAGC TCC36GTGCX5C 240 

ATTGAGCTCT 60GACACAGC GG6ACAGGAG GATTTTGAOC GACTTCGTTC CCTTTGCTAC 300 

OOSGATAOOS ATGrCTTCXTT GGGGTGCTIC AGCGTGGTGC AGCCCAGCTC CTTTCAAAAC 360 

ATCACAGAGA AATGGCTGCC OGAGATCXSSC AOGCACAACC CCCftGGOGCC TGTGCTGCTQ 420 

GTGGGCACCC AGGCCGACCT GAGGGAOGAT GTCAAOGTAC TAATTCA6CT 6QACCAGGG6 480 

GGCCGGGAGG GCCCOGTGCC CCAACCCCAG GCTCA6GGTC TGGCCGAGAA GATCOGAGCC 540 

TGCTGCTACC TTGAGTGCTC AGCCTTGAOQ CAGAAGAACT TGAAGGAAGT ATTTGACT06 600 

GCTATTCTCA GTGCCATrGA 6CACAAA60C 0GGCTGGA6A AGAAACTGAA TGCCAAAGGT 660 
GT0CX5CACCC TCTCCOGCTQ C0GCT6GAAG AAGTTCTTCT GCTTOGTTTG A 

Seq ID NO: 487 Protein sequence 
Protein Accession #: XP_063832.1

1 11 21 31 41 51

I I I I I I

MPPRELSEAE PPPIiRAPTPP PRRRSAPPEL GIKCVLVGDG AVGKSSLIVS VTCNGYPARY 60 

RPTALDTFSV QVliVDGAPVR lELHDTAGQB DFDRLRSLCY PDTDVFLACP SWQPSSFQN 120 

Z7BKWLPBZR THNFQAPVIiL VCTQADLRDD VHVLIQIAQG GRE6PVPQPQ AQGLASKIRA 180 
CCYLBCSALT QXSLKEVFDS AILSAIEBKA RLEKKLKAKG VRTIiSRCRWK KFFCFV 

Seq ID NO: 488 DNA sequence

Nucleic Acid Accession #: NM_014398.1

Coding sequence: 64..1314

1 11 21 31 41 51

I I I I I I

GGCACCBftTT OGGGGCCTGC CC66ACTT06 CCQCA06CTG CAGAAOCTGG CCCA GOGCCC 60 

ACCAIGCCOC GOCAOCTCAB CBGGGOGGCC 6GQCTCTTC0 C6TGCCTGGC GGTAATTTT6 120 

CACQATG6CA 6TCAAATGAG AGCAAAA6CA TTTCCA6IUVA CCAGAGATTA TTCTCAACCT 180 

ACTGCAGCAG CAACAGTACA GGACATAAAA AAAOCTGTCC AGCAACCAGC TAAGCAAGCA 240 

CCTCACCAAA CTTTAGCAGC AAGATTCATG GATGGTCATA TCACCTTTCA AAC3VG0GGCC 300 

ACA6TAAAAA TTCCAACAAC TAOOCCAGCA ACTACAAAAA ACACTGCAAC CACCAGCCCA 360 

ATTACCTACA CCCTGGTCAC AAGCCA66CC ACACOCAACA ACTCACACAC AGCTCCTCCA 420 

GTTACTGAAG TTACAOTOGG CCCTAGCTTA GCCCCTTATT CACTGOCACC CACCATCACC 480 

CCACCA6CTC ATACA8CT0G AACCAGTTCA TCAACCGTCA GCCACACAAC TGGGAACACC 540 

ACTCAACCCA GTAACCAGAC CACCCTTCCA GCAACTTTAT CGATAGCACT GCACAAAAGC 600 

ACAACCGGTC AGAAGCCTGA TCAACCCACC CATGCCCCAG GAACAAOGOC AGCTGCCCAC 660 

AATACCACCC GCACAGCTGC ACCTGCCTCC AOGGTTCCTG GGCOCACCCT TGCACCtCAG 720 

CCATOSTCAG TCAAGACTGG AATTTATCAG GTTCTAAAOG GA ASCASAC T CTGTATAAAA 780 

GCAGAGATGQ GGATACAGCT GATTGTTCAA GACAAGGAGT OGOTTTTTTC ACCTOGGAGA 840 

TACTTCAACA TCGAOCCCAA CGCAACGCAA CCCTCTGGGA ACTGTGGCAC COGAAAATCC 900 

AACCTTCTGT TGAATTTTCA GGGCGGATTT GTGAATCTCA CATTTAGCAA GGATGAAGAA 960 

TCATATTATA TCAGTGAAGT 6GGAGCCTAT TTGACOGTCT CAGAT0CA6A GACAGTTTAC 1020 

CAAGGAATCA AACATGCGGT GGTGATGTTC CAQACAGCAO TOGGGCATTC CTTCAA6TGC 1080 

GTGAGTGAAC AGAQCCTCCA GTTGTCA6CC CACCTGCAGG TGAAAACAAC CGATGTCCAA 1140 

CTTCAAGOCT TTQATTTTGA AGATGACCAC TTTGGAAATQ TGGATGAGTG CT06TCTGAC 1200 

TACACAATTG TGCTTCCTGT GATTGGGGCC ATCGTGGTTG GTCTCTGCCT TAT GGGTATQ 1260 

GGTGTCTATA AAATCCGCCT AAGGTGTCAA TCATCTOQAT ACCAGAGAAT CTAATTGTT6 1320 

CCCGGGGGGA ATGAAAATAA TGGAATTTAG AGAACTCTTT CATOCCTTCC AGGATGGATG 1380 

TTGGGAAATT CCCTCAGAGT GTQOGTCCTT CAAACAATGT AAACCACCAT CTTCTATTCA 1440 

AATGAAGTGA 6TCATGT6TG ATTTAAGTTC AGGCAGCACA TCAATTTCTA AATACTTTTT 1500 

GTTTATTTTA TGAAAGATAT AGTGAGCTGT TTATTTTCTA GTTTCCTTTA GAATATTTTA 1560 

GCCACTCAAA GTCAACATTT GAGATATGTT GAATTAACAT AATATATGTA AAGTAGAATA 1620 

AGCCTTCAAA TTATAAACCA AGGGTCAATT GTAACTAATA CTACTGTGTG TGCATT6AA6 1680 

ATTTXATTTT ACOCTTQATC TTAACAAA6C CTTTGCTTTG TTATCAAATG GACTTTCAGT 1740 

GCTTTTACtA U ' Cl ' U ' WrriT ATGOTTTCAT 6TAACATACA TATTCCTGGT GTAGCACTTA 1800 

ACTCCTTTTC CACTTTAAAT TTGTTTTTGT TTTTTGAGAC GGAGrTTTCAC TCTTGTCACC 1860 

CAGOCTGGAG TACAGTGGCA 0GATCTCG6C TTATGGCAAC CTCCOCCTCC CGGGTTCAAG 1920 

TGATTCTCCT GCTTCAGCTT OCOGAGTAGC TGGGATTACA GGCACACACT ACGA CGCC TG 1980 

GCTAATTTTT GTMTTTTAT TATAGAC6GG TTTCACCATG TPGGOCAGAC TGGTCTTGAA 2040 

CTCTTGACCT CAGOTOATCC ACCCACCTCA GCCTCCCAAA GTGCTGGGA7 TACAGGOITG 2100 

AQO CA TTGOO COOQGCCTTA AATGTTTTTT TXAATCATCA AAAAGAACAA CATATCTGAQ 2160 
C5TTGTCTAW3 TCTTTTTATG TAAAACCAAC AAAAAGAACA AATCAGCTTA TATTTTTTAT 2220

CTTGATGACT CCTGCTCCAG AATTGCTAGA CTAAGAATTA GGTGGCTACA GATGGTAGAA 2280 

CTAAACAATA AGCAAGAOAC AATAAXAAT6 GCCCTTAATT ATTAACAAAG 7GCCAGAGTC 2340 

TAGGCTAAGC ACTTTATCTA TATCTCATTT CATTCTCACA ACTTATAAC5T GAATQAGTAA 2400 

ACTGAGACTT AAGGGAACTG AATCACTTAA ATGTCACCTG GCTAACTGAT OGCAGAGCCA 2460 

GAGCTTGAAT TCATGTTGGT CTGACATCAA GGTCTTIGGT CTTCTCCCTA CACCAAGTTA 2520 

CXTTACAACSAA CAATGACACC ACACTCTGCC TGAAGGCTCA CACCTCATAC CAGCATACGC 2580 

TCACCTTACA GGQAAATGGG TTTATCTAGG ATCATGAGAC ATTAGGGTAG ATGAAAGGAG 2640 

AGCTTTGCAG ATAACAAAAT AGCCTATCCT TAATAAATCC TCCACTCTCT GGAAGGAGAC 2700 

TGAGGGGCTT TGTAAAACAT TAGTCAGTTG CTCATTTTTA TGGGATTGCT TAGCTGGGCT 2760

GTAAAGATC3A AGGCATCAAA TAAACTCAAA GTATTTTTAA ATTTTTTTGA TAATAGAGAA 2820 

ACTTCOCTAA OCAACTGTTC TrTCTTGAGT GTATAGCCCC ATCTTGTGGT AACTTGCTGC 2880 

TTCTGCACTT CATATCCATA TTTCCTATTG TTCACTTTAT TCTGTAGAGC AGCCTGCCAA 2940 

GAATTTTATT TCTGCTGTTT TTTTTGCTGC TAAAGAAAGG AACTAAGTCA GGATX5TTAAC 3000 

AGAAAAGTCC ACATAACCCT AGAATTCTTA GTCAAGGAAT AATTCRAGTC AGCCTAGAGA 3060 

GCAT6TTGAC TTTCCTCATG TGTT70CTTA TC2ACTCA6TA AGTTGGCMG GTCCTGACTT 3120 
TAGTCTTAAT AAAACATTGA ATTGTAOTAA AGGTTTTTGC AATAAAAACT TACTTTGG 

Seq ID NO: 489 Protein sequence 
Protein Accession #: NP_055213.1

1 11 21 31 41 51

I I I I I I

MPRQLSAAAA LPASLAVILH DGSQMRAKAP PETRDYSQPT AAATVQDIKK PVQQPAKQAP 60 

RQTIiAARPMD GHITPQTAAT VKIPTTTPAT TKNTATTSPI TYTLVTTQAT PNNSHTAPPV 120 

TEVTVGPSLA PYSU>PTITP PAHTAGTSSS TVSHTT6NTT QPSMOTTLPA TLSIAIAKST 180

TGQKPDQPTR APGTTAAABH TTSTAAPAST VPGFTLAPQP SSVKTGZYQV LNGSRIiCIKA 240 

Br4GIQLXVQ0 KESVFSPRRY FNIDPKATQA ^GNOGTRKSN LLLHFQGGFV NLTFTKDEBS 300 

YYISEVGAYL TVSDPETVYQ GIKHAWMPQ TAVGHSPKCV SEQSLQLSAH LQVKTTDVQL 360 
QAFDPEDOHF GNVDECSSDY TIVLFVIGAI WGI«CLKGMG VYKZRLROQS SGYQRZ 

Seq ID NO: 490 DNA sequence

Nucleic Acid Accession #: NM_005409.3

Coding sequence: 94..378

1 11 21 31 41 51

I I I I I I

TTCCTTTCAT GTTCAGCATT TCTACTCCTT CCAAQAAGAO CAGCAAASCT GAAGTAGCAG 60 

CAACA6CACC AGCAGCAACA 6CAAAAAACA AACATGAGTG TGAAGGGCAT GGCTATAGCC 120 

TTGGCTGTQA TATTGT6TGC TACASTTGTT CAAG6CTTCC CCATGTTCAA AAGAGGACGC 180 

TGTCTTTGCA TACGCC C TGO GGTAAAA6CA GTGAAAGTGG CAGATATT6A GAAAGCCTCC 240 

ATAATGTACC CAAGTAACAA CTGTGACAAA ATA6AAGTGA TTATTACCCT GAAAGAAAAT 300 

AAAGGACAAC GAT6CCTAAA TCCCAAATCG AAGCAAOCAA GGCTTATAAT CAAAAAAGTT 360 

GAAAGAAAGA ATTTTTAAAA ATATCAAAAC ATATGAAGTC CTGQAAAAGO GCATCTGAAA 420 

AACCTAGAAC AAGTTTAACT GTGACTACTG AAATQACAAG AATTCTACAG TAGGAAACTG 480 

AGACTTTTCT ATGGTTTTGT GACTTTCAAC TTTT6TACAG TTATGTGAAS GATGAAAGG7 540 

GGGTGAAAGG AOCAAAAACA GAAATACAGT CTTCCTGAAT GAATGACAAT CAQAATTCCA 600 

CTGCCCAAAG GAGTCCAGCA ATTAAATCGA TTTCTAGGAA AAOCTACCTT AAGAAAGGCT 660 

GGTTACCATC GGAGTTTACA AAGTGCTTTC ACGTTCTTAC TTGTTGTATT ATACATTCAT 720 

GCATTTCTAQ GCTAGAGAAC CTTCTAGATT TGATGCTTAC AACTATTCTG TTGTGACTAT 780 

GAGAACATTT CTQTCTCTAG AAGTTATCTG TCTGTATTGA TCTTTATGCT ATATTACTAT 840 

CTGTGGTTAC AGTGGAGACA TTGACATTAT TACTGGAGTC AAGCCCTTAT AAGTCAAAAG 900 

CATCTATGTC TOGTAAAGCA TTCCTCAAAC ATTTTTTCAT GCAAATACAC ACTTCTTTCC 960 

CCAAATATCA TGTAGCACAT CAATATGTAQ OGAAACATTC TTATGCATCA TTTGGTTTGT 1020 

TTTATAACCA ATTCATTAAA TGTAATTCAT AAAATOTACT ATGAAAAAAA TTATAOGCTA 1080 

TGGGATACTO GCAACAGTGC ACATATTTCA TAACCAAATT A6CAGCAC0G GTCTTAATTT 1140 

QATGTTTTTC AACTTTTATT CATTGAGATG TTTTGAAGCA ATTAGGATAT GTOTGTTTAC 1200 

TOTACTTTTT GTTTTGATCC GTTTGTATAA ATGATAGCAA TATCTTOOAC ACATTTGAAA 1260 

TACAAAATGT TT T TGTCTAC CAAAGAAAAA TGTTGAAAAA TAAGCAAATG TATACCTAGC 1320 

AATCACTTTT ACTTTTTGTA ATTCTGTCTC TTAGAAAAAT ACATAATCTA ATCAATTTCT 1380 

TTGTTCATGC CTATATACTG TAAAATTTAG GTATACTCAA GACTAGTTTA AAGAATCAAA 1440 
GTCATTTTTT TCTCTAATAA ACTACCACAA CCTTTCTTTT TTAAAAAAAA AAA 

Seq ID NO: 491 Protein sequence 
Protein Accession #: NP_005400.1 

1 11 21 31 41 51 

| | | | | | 

MSVKQQAZAL AVIZiC3lTW0 6FPMFKRGRC LCIGPGVKAV XVADIEKASZ MrPSMNCDKZ 60 
HVZITLKENK GQRCU7PKSK QARLIIK2CVE RKEiF 

Seq ID NO: 492 DNA sequence 

Nucleic Acid Accession #: NM_000577.1 

Coding sequence: 41..520 

1 11 21 31 41 51 

| | | | | | 

GGCA0GAGG6 QAAOACCTCC TGTOCT A TCA GGCOCTOCCC AT66CTTTAG AGAOQATCTO 60 

CCGACCCTCT GGGA6AAAAT CCAGCAAGAT GCAAGCCTTC AGA A TCT OG O ATGTTAACCA 120 

GAAGACCTTC TATCTGAGGA ACAAOCAACT AGTTGCXXK3A TACTTGCAAG QACCAAATGT 180 

CAATTTAGAA GAAAAGATAG ATGTGGTACC CATTGAGCCT CATGCTCTGT TCTTGGGAAT 240 

CCAIGGAGG6 AAGATGTGCC TGTCCTGTGT CAAGTCTG6T GATGAGAOCA GACTCCAGCT 300 

GGAGGCA0TT AACATCACT6 ACCTOAOCGA GAACA6AAAG CAGGACAAQC GCTTOGOCTT 360 

CA7CCS6CTCA GACAGT6GCC CC31CGACCAG TTTT6AGTCT GCOG O CTGCC O OG GTTGGTT 420 

CXrrCTGCACA GOGATGGAAG CTGACCAGCC CXntZAGCCTC ACCAATATGC CT6ACGAA60 480 

OGTCATGGTC ACC3UUVTTCT ACTTCCAGGA 0GACX3AGTAG TACTGCOCAG GCCTGCCTGT 540 

■ixxx A rivrr GCATGGCAAG GACTGCAGGG ACTGCCAGTC C C CCTGCCCC AGGGCTCCXS 600 




GCTATGGGGG CACTGftGGIlC OVGOCATTGA GGGGTGGACC CTCAGAAGGC GTCACAACAft 660 

CCTGGTCACft GCTCTCTGCC TCCTCTTCRA CTGACCRGCC TCCATOCTGC CTCCWSAATO 720 

GTCTTTCTAA TGTGTGAATC AGAGCACAGC AGCCCCTGCA CAAAGCCCTT CCATGTGGCC 780 

TCTGCATTCA GGATCAAACX: CXZCACCACCT GCCCAACCTG CTCTCCTCTT GCCACTC3CCT 840 

CTTCCTCCCT CATTCCACCT TCCCATGCCC TGGATCCATC AGGCCACTTG ATGACCCCCA 900 

ACCAA6TGGC TCCCACACCC TGTTTTACAA AAAAGAAAAG AOCAGTCCAT GAG6GA66TT 960 

TTTAAGGGTT TGTGGAAAAT GAAAATTAG6 ATTTCATGAT ITmTmT CAGTOCCOST 1020 

GAAGGAGA6C CCTTCATrPG GflCATTATGT TCTTTOGGGG AGAGGCTGAG G ACTTA AAAT 1080 

ATTCCTGCAT TTGTGAAATG ATGGTGAAAG TAAGTGGTAG CTTTTCCCTT CTTTTTCTTC 1140 

TTTTTTTGT6 ATOTOXAAC TTGTAAAAAT TAAAAiGTTAT GGTACTATGT TAGCCCXa^TA 1200 

ATTTTTTTTT TOCTTTTAAA ACACTTOCAT AATCTG6ACT CXTPCXCTCCA G6CACXGCTG 1260 

CCCAGCCTCC AAGCTCCATC TCCACICCA6 ATTTTTTACA GCTGCCTGCA 6TACTTTACC 1320 

TCCTATCAGA AGTTTCTCAG CTCCCAAGGC TCTGAGCAAA TGTGGCTCCT GGGOSTTCTT 1380 

TCTTCCTCTG CTGAAGGAAT AAATTGCTCC TTOACATTGT AGAGCTTCTG GCACTTGGAG 1440 

ACTTGTATGA AAGATG6CTG TQOCTCTGOC TGTCTCCCCC ACCAG6CTGG GAGCTCTGCA 1500 

GAGCAGGAAA CATGACTOGT ATATGTCTCA GGTCCCTGCA GG6CCAAGCA CCTAGCCTOG 1560 

CTCTTGGCAG GTACTOU^CQ AATGAATGCT GTATATGTT6 GGTGCAAAGT TOOCTACTTC 1620 

CTGTGACTTC AGCTCTGTTT TAGAATAAAA TCTTGAAAA7 6CCCAAAAAA AAAAAAAAAA 1680 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAA 

Seq ID NO: 493 Protein sequence 
Protein Accession #: NP_000568.1 

1 11 21 31 41 51 

| | | | | | 

MAIiETICRPS 6RKSSKKQAF RIWDVKQKTF YXiKKKQLVAG YLQGFNVNLE EKZQWPZBP 60 

HALFIiGXHGG KKOiSCVKSG DBTRLQZiBAV KZTDLSSnUC QDKRFAFIRS OSGPTTSFSS 120 
AACPGWFLCT AMBADQFVSL TNMPDEGVMV TKFYFQSDE 

Seq ID NO: 494 DNA sequence 

Nucleic Acid Accession #: NM_002081.1 

Coding sequence: 222..1898 

1 11 21 31 41 51 

| | | | | | 

GGCTGCCCGA GOSAGCGTTC QGACCTCQCA CCCCGCGOGC CCOGOGCCGC C3GC0GCCGCC 60 

GGCTTTTGTT GTCTCCGCCT CCTOGGCOSC CGCOGCCTCT GGACCGCGAG CCGOGCGCGC 120 

OGGGACCTTO GCTCTQCCCT TCGGGGG06G GAACTGGGCA GGACC066CC AG6ATCCGAG 180 

A6AGG0GCGQ GOGGGTGGCC GGGGGO6C06 COGGCCCOGC CATG6A6CTC OGGGCCCGAG 240 

GCTGGTGGCT GCTATGTGCG GCCGCAGOSC TGGTCGCCTG C3GCC0G0GGG GACCCX3GCCA 300 

GCAAGAGCC3G GAGCTGCGGC GAGGTCOGCC AQATCTACGG AGCCAAGGGC TTCAGCCTGA 360 

G0GAC6TGCC CCAGGCX3GAG ATCTCGGGTG AGCACCTGGG GATCTGTCCC CAGGGCTACA 420 

CCTGCTGCAC CAGCGAOATG GAGQAOAAGC T8GCCAAC0S CAGCCATGCC 6AGCTGQAGA 480 

GC30G6CTCOG 66ACAGCAGC CGOGl'CCTGC A0GCCA7GCT TGCCACCCAO CTG08CAGCT 540 

TOGATGACCA CTTCCAGCAC CTGCTGAAOG ACTOGGAGOG GACGCTGCAG GCCACCTTCC 600 

OCGGCGCCTT CGGAGAGCTG TACACGCAGA ACX«:GAGGGC CTTCOGGGAC CTGTACTCAG 660 

AGCTGCGCCT GTACTACCGC GGTGCCAACC TGCACXTGGA GGAGAOGCTO GCOGAGTTCT 720 

GOGGCOGCCT OCTGQAGCGC CTCTTCAA6C A6CT6CA00C CCAGCTGCTO CTGOCTGATG 780 

ACTACCTGGA CTGCCTOGGC AAGCAG8C0G AGG0GCT60S GCCCTTCGOG GAGGCCCOGA 840 

GAGAGCTGCG CCTGOGQGCC ACCCGTGCCT TOGTGGCTGC TOGCTCCTTT GTGCAGGGCC 900 

TGGGCGTGQC CAGOSACGTG GTCCGGAAAG TGGCTCAGGT CCCCCT6GQC CCGGAGTGCT 960 

OGAGAGCTGT CATGAAGCTG GTCTACTGTG CTCACTGCCT GGGAGTCCCC GGOGCCAGGC 1020 

GCIGCCCTOA CTATTQC06A AATGTGCTCA AGGGCTGCXT TGCCAAGCAG G00GACCTG6 1080 

AOGCOGAGTQ QA6GAACCTC CTGGACTCCA T66TGCTCAT CAC06ACAA6 TTCTOGQGTA 1140 

CATCGGGTGT G6AGAGTGTC ATCGGCA60G TGCACAOGTG GCTGGCGGAG OCCATCAAOG 1200 

CCCTCCAGGA CAACAGGGAC AOGCTCACGG CCAAGGTCAT CCAGGGCTGC GGGAACCCCA 1260 

AG6TCAACCC CCAGGGCCCT OGOCCTGAGG AGAAGCGGOG CCGGGGCAAG CTGGCCCCGC 1320 

QGGAGAOGCX: ACCTTCA60C ACGCTGGAOA AQCTGQTCTC TGAAGCCAA6 6CCCA6CTCC 1380 

QOQAOGTCCA GK3ACTTCTGQ ATCAGCCTOC CAG6GACACT GTGCAGTGAG AAGATGQCCC 1440 

TGAGCACTGC CAGTGATGAC OGCTGCTGGA ACGGGATGGC CAGAGGCCGG TACCTCCCCG 1500 

AGGTCATGGG TGAOGGCCTG GCCMi!CCPJ3k TCAACAACCC CGAGGTGGAG OTGGACATCA 1560 

CCAAGCOGGA CATGACCATC CX^GCAGCAGA TCATGCAGCT GAAGATCATG ACCAACCGGC 1620 

TG06CAG0GC CTACAAOSGC AACQAOGTGG ACTTOCAGOA G6CCA0TGAC 6A0G6CAG06 1680 

6CTCGGGCA0 0Q0T6ATGGC TOTCTQOATO ACCTCTGGQ3 CO66AA0GTC A0CA0GAA6A 1740 

QCTCCAGCTC COSGAOGCCC TTGACCCATG CCCTCOCAGG CCTGTCRGAG CAGGAAGGAC 1800 

AGAAGACCTC GGCTGCCAGC TGCCCCCAGC CCCCGACCTT CCTCCTGCCC CTCCTCCTCT 1860 

TCCTGGCXXrr TACAGTAGCC AGGCCCG6GT GGOGGTAACT GCCCCAAGGC CCCA6GGACA 1920 

GAGGCCAAG6 ACTGACmO CCAAAAATAC AA C ACAGACO ATATTTAATT CACCTCAOCC 1980 

TGGAGAGGCC TGQGGTG66A CAGGGAG6GC 06GGG6CTCT GA6CAGGGGC AGGGGCAGA6 2040 

OTCCCAOCCC C3«3GCCTGOC CTCX3CCTGCC TTTCTGCCTT TTAATTTTGT ATGAGGTCCT 2100 

CAOGTCAGCT GGGAGCCAGT GTGCCCAAAA GCCATGTATT TCA6GGACCT CA6GGGCACC 2160 

TCCGGCTGCC TA6CCCTCCC CCCAGCIOCC TGCACOGCOG CAGAAGCAGC OCCTOGAGGC 2220 

CTACAGAGGA 6G0CTCAAAG CAACCOBCTO GAGCCCACAO GQAOCCTOTO OCTTCCTCCC 2280 

OGCCTCCTCC CACTGGGACT CCCAGCAQAO OCCACCAGCC A0CCCTG6CC CACCCCCCAQ 2340 

CCTCCAGAGA AGCCCCX3CAC GGGCTGTCTG GGTGTCOGCC ATCCAGGGTC TGGCAGAGCC 2400 

TCTGAGATGA TGCATXSATGC CCTCCCCTCA GCGCAGGCTG CAGAGCCCGG CCCCACCTCC 2460 

CTGOGCCCTT GAGGGGCCCC AG06TCTGCA GGGTGA06CC TGAGACAGCA OCACT G CTGA 2520 

GGAGTCTGAG GACTGTCCTC CCACA6ACCC TGCAOTQAOG GGOOCTCCAT 6GGCAGATGA 2580 

GGGGCCAC7G ACCCACCTGC GCTTCT G CTG GAG6A60GGA ASCTGGOCCC AAAG6CCC3U3 2640 

GGAGGCAGOG TGGGCTCTGC CAATG TOG GC TGCOOCTCGC ACACAGG6CT CACAGGGOVG 2700 

GCCTTCCPGG GGTCXAGGGC TGTTGGAGGA CCCOSAGGGC TCAGGAGCAG CCAGGACCCG 2760 

CCTGCTC C CA TCCTCACCCA GATCA6GAAC CAGGGCCTOC CT6TTCAGGG T GACACA GGT 2820 

CAGG6CTCAG AGTGACCCTC GGCTGTCACC TGCTCAGAGQ QAIGCTGGTG OCTOQTOAGA 2880 

CCCOGCACTG CACACGGGAA TGCCTAGGTC OCTTOCOGAC CCAGOCAGCT GCACTGCAOG 2940 

GCACGGGGAC CTGGATAGTT AAGG6CTTTT CCAAACATGC ATCCATTTAC TGACACTTCC 3000 

TGTCCTTGTT CATGGA6AGC TGTTOGCTCC TCCCAGATGG CTTCGGAGGC COGCAGGGOC 3060 

CACCTTGGAC 0CTGGT6ACC TCCTGTCACT CACTGAGGCC ATCAGGGCCC TGCCCC3W3GC 3120 
CTGGAOSGGC CCTCCTTCXX: TCCTGTGCCC CAGCTGCCAG GTGGCOCTGG GGAGGGWTGG 3180 

TGTGGT6TTG GGAAGGGGTC CTGCAGGGGG AGGAGGACTT GGAG GSTCTG GGGGCAGCTG 3240 

TCCTGAACCG ACTGACCCTG AOGftGGCCGC TTAGTGCTGC TTTGCTTTTC ATCACOCTCC 3300 

CGCACAGTGG ACX3GAGGTCC C08GTT6CTG GTCAG6TCCC CATGGCTTGT TCTCTGGAAC 3360 

CTGACTTTAG ATGTTTTGGG ATCAGGAGCC COCAACACRG 6CAAGTCCAC CCCATAATAA 3420 

CC3CTGCCAGT GCCAGGGTG6 GCTGGGGACT CTGGCACAGT GATGCOGGGC G<XAGGACAG 3480 

CAGCACTCCC 6CTGCACACA GACGGCCTAG GCSGTGGOGCT CAGACOCCAC CCTAOGCTCA 3540 

TCTCTGGAAG GGGCAGCCCT GAGTGGTCA.C TGGTC3U3GGC AGTG GOCAAG CXTGCTGTGT 3600 

CCTTCCTCCA CAAC5GTCCCC CCACCGCTCA GTGTCA6086 GTCACGT6TG TTCTTTTGAG 3660 
TCCTTCTATO AATAAAAGGC TGGAAACCTA AA 

Seq ID NO: 495 Protein sequence 
Protein Accession #: NP_002072.1 

1 11 21 31 41 51 

| | | | | | 

MBIiRARGWHL liCAAAALVAC ARGDPASKSR SCX3EVRQIYG AiCGFSLSDVP QABISGSHLR 60 

ICPQGVTCCT SEMEHHiANR SHABLETALR DSSRVLQAML ATQLRSFDDH FQHLLNDSER 120 

TLQATPPGAP GELYTQNARA FRDLYSELRIj YYRGANLHLE ETLAEFMARL LERLFKQLHP 180 

QLLLPDDYLD CLGKQAEAItR PFGEAPRELR LRATRAPVAA RSFVQGLGVA SDWRKVAQV 240 

PLOPfiCSRAV MRLVYCAHOj GVFGARPCPD YCRHVLKGCL AK0ADU3AEH SNLLDSKVLZ 300 

TDKFVK3TS6V ESVIQSVBTH XAEAXMAICD NRDTLTAKVX QGGQIIPKVKP Q6FQPEEKRR 360 

RGKZjAPRERP PSGTLEKLVS EAKAQUtDVQ DFHZ8LPGTL CSEKKALSTA SODRCHNOMA 420 

RGRYLPBVMG DGIANQXMNP EVBVDITKPD MTXRQQIMQL KIHTNRLRSA YNGNDVDPQD 480 
ASDDGSGSGS 

Seq ID NO: 496 DNA sequence 

Nucleic Acid Accession #: NM_001650.2 

Coding sequence: 40..1011 

1 11 21 31 41 51 

| | | | | | 

GGGGCAGGCA ATGAGAGCTG CACTCTGGCT GGGGAAGGCA TGAGTGACAG AC CCAC AGCA 60 

AGGCGGTGGG GTAAGTGTGG ACCTTTGTGT ACCAGAGAGA ACATCATGGT GGCTTTCAAA 120 

GGGGTCTGGA CTCRAGCTTT CTGGAAAGCA GTCACAGOGG AATTTCTGGC CA TGCT TATT 180 

TTTGTTCTCC TCA6CCTGG0 ATOCAOCATC AACTQGGGTQ QAACAGAAAA 6CCTTTACC6 240 

OTOGACATGO TTCTCATCTC CCTTTGCTTT GGACTCASCA TTGCAACCAT GSTGCRGTGC 300 

TTTGGCCATA TCAGCGGTGG CCACATCAAC CCT6CAGTGA CTGTGGCCAT GGTGTGCACC 360 

AGGAAGATCA GCATCGCCAA GTCTGTCTTC TACATCGCA6 CCCAGTGCCT GGGGGCCATC 420 

ATTOGAGCAG GAATCXTTCTA TCTGGTCACA CCTCCCA6TO TG6TGGGAGG CCTGGGAGTC 480 

AOCATGQTTC ATGGAAATCT TAOOSCTGOT CATGGTCTCC TGGTT6AGTT GATAATCACA 540 

TTTCAATTGG TOTTTACTAT CTTTGCCAGC TGTGATTOCA AAOGGACTGA TGTCACTG6C 600 

TCAATAGCTT TAGCAATTGG ATTTTCTGTT GCAATTGGAC ATTTATTPGC AATCAATTAT 660 

ACTCGTGCCA GCATGAATCC CGCCCXSATCC TTTGGACCTG CAGTTATCAT GGGAAATTGG 720 

GAAAACCATT GGATATATTG GGTTGGGCCC ATCATAGGAO CIGTCCTCGC TGGTQGCCTT 780 

TATQAGTAT8 TCTTCTGTCC AGATGTTGAA TTCAAAGGTC GriTTTAAAGA AGCCTTCAGC 840 

AAA6CT6CCC AGCAAACAAA AGGAAGCTAC ATGGAGGT6G AGQACAACAG QA6TCAG6TA 900 

GAGACGGATG ACCTGATTCT AAAACCTGGA GTGGTGCATG T6ATTGACGT T6ACCGGGGA 960 

GAGGAGAAGA AGGGGAAAGA CCAATCTGGA GAGGTATTGT CTTCAGTATG ACTAGAAGAT 1020 

CGCaCTGAAA GCAOACAAGA CTCCTTA6AA CTGTCCTCAG ATTTCCTTCC ACCCATTAAG 1080 

GAAACAOATT TGTTATAAAT TAGAAAIGTO CAGGTTT6TT GTTTCATGTC AXATTACTCA 1140 

GTCTAAACAA TAAATATTTC ATAATTTACA AAGGA6QAAC 6GAAGAAACC TATTGT6AAT 1200 

TCCAAATCTA AAAAAAQAAA TATTTTTAAG ATGTTCTTAA 6CAAATATAT ACCTATTTTA 1260 

TCTAGTTACC TTTCATTAAC AACCAATTTT AACCGTGTGT CAAGATTTGG TTAAGTCTTQ 1320 

OCTGACAGAA CTCAAAGACA CGTCTATCAG CTTATTCCTT CTCTACT06A ATATTGGTAT 1380 
AGTCAATTCT TATTTGAATA TTTATTCTAT TAAACTGAGT TTAAdAATGG C 

Seq ID NO: 497 Protein sequence 
Protein Accession #: NP_001641.1 

1 11 21 31 41 51 

| | | | | | 

MSDRPTARRW GKOGPLCTRB NIMVAFK6VW TQAFWKAVTA EPLAMLIPVL LSLGSTINWG 60 

GTBKPLPVDM VLISLCPGLS lATMVQCPGH ZSGGHINPAV TVAMVCTRKI SIAKSVFYIA 120 

AQOXSAIIGA GILYLVTPPS WGGliGVTMV HGNLTACffiGL IjVELIITFQL VFTIFASCDS 180 

KRTUVTGSIA LAIGFSVAIG HLFAINYTGA SMNPARSFGP AVIMGNWENH NZyfrVGPZIG 240 

AVZiAGGLYEY VFCPOVEFKR RFKEAFSKAA QOTKOSYMBV EDKRSQVETD DLZUCP6WH 300 
VZDVDRGEEX KGKDQSGEVIi SSV 

Seq ID NO: 498 DNA sequence 

Nucleic Acid Accession #: AB020684.1 

Coding sequence: 1..1744 

1 11 21 31 41 51 

| | | | | | 

CCCCCTTGTC ATTAATACAT TAAAAAGATT CAATCTTTAC CCTGAGGTAA TTTTGGCCAG 60 

TTG6TAC0GG ATTTATAOCA AAATAATGGA CTTGATTOOT ATTCAAACCA AGATATGTT6 120 

GAOGGTTACC AGAGGAGAAG GACTCAGTCC TATTGAAAGC TGTGAAGGAT TGGGAGATCC 180 

TGCTTGCTTT TATSTTOCTG TAATTTTTAT TTTAAATGGA CTAATGATGG CATTATTCTT 240 

CATATATGGC ACATATTTAA GTGGCAGCOG ATTAGGAGGC CTGGTTACAG TGTTGTGCTT 300 

CTTTTTCAAT CATGGAGAGT GTACCOGTGT AATGTGGACA CCACCTCTCC GTGAAAGCTT 360 

CTCATATCCA TTTCTTGTTC TTCAGATGTT GCTAGTGACT CATATTCTCA GGGCTACAAA 420 

ACTTTATAGAk GGAAGCITGA TTGCACTCIG CATTTCCIVAT GTATIYITCA 1GCTTCCTTQ 480 

GCAGTTTGCT QUSTTTCTAC TTCTTACTCA GATTGGATCA TTATTT6CAS TATATGTTGT 540 

06GGTACATT GATATATGTA AATTACGGAA GATCATTTAT ATACACATGA TTTCTCTTGC 600 

ACTTTGTnT GTTTTGATGT TTGGGAACTC AATGTTATTA ACTTCTTATT ATGCTTCTTC 660 

TTTGGTAATT ATTTGGGGTA TTCDGGCAAT GAAACCACAT TTCCTGAAAA TAAAT6TATC 720 
TGAACTTAGT TTATGGGTTA TTCAAG6AT6 TrmtjUTrA TTTGGAACT6 TCATACTTAA 780 

ATACTTGACA TCTAAAATTT TTX3GTATTOC AGATGA06CT CATATTGGCA ACTTACTAAC 840 

ATCAAAATTC TTTAGTTATA AGGATTTTGA TACTTTATTG TATACCTGTG CAGOGGAGTT 900 

TGACTTTATG GAAAAAGAGA CTCCACTGAG ATACACAAAG ACATTATTGC TTCCAGTTGT 960 

TCTT6TAGTG TTTGTT6CTA TTGTTAGAAA GATTATTAGT GATATGTGGG GTGTCTTAGC 1020 

TAAACAACA6 ACACATGTAA GAAAACAOA GTTTGATCAT GGAGAGCTGG TTTACCATGC 1080 

ATTGCAATT6 TTA6CATATA CA6CCCTTGG TATTTTAATT ATGAGACTAA AACTCTTCTT 1140 

GACAC3CACAC ATGTGTGTTA TGGCATCACT GATCTGCTCA AGACAQCTAT TTGCSATGGCT 1200 

CTTTTGCAAA GTACATCCTG GTGCTATTXn* GTTTGCTATA TTAGCAGCAA TGTCAATACA 1260 

AGGTTCA6CA AATCTGCAAA CCCA6TG6AA TATTGTACG6 GAGTTCA6CA ATTT6CCCCA 1320 

AGAAGAACTT ATASAATOGA TCAAATATAG TACTAAACCA GATQCAGTGT TT60QGGTGC 1380 

CATGCCCAOS ATGGCAAGTG TTAAGCTCTC T6CACTTCG0 CCCATTGT6A ATCATCCACA 1440 

TTATGAAGAC GCAGGCTTQA GA6CCAGAAC AAAAATAGTA TACTCAATGT ATAGTC6GAA 1500 

AGCA6C06AA GAAG3X»AGC GAGAACTGAT AAAGTTAAAA GTGAACTATT ACATTCTAGA 1560 

AGA6TCATGG TGTGTAAGAA GATCCAA60C TGGTTGCAGT ATGCCTGAAA TTTGGGATGT 1620 

A6AAGATGCT GCCAATGCTO GGAAAACTCC CTTATGTAAC CTCTTGGT6A A6QATTCCAA 1680 

ACCTCACTTC ACCACTGTAT TGCAGAACAG T6TTTAC3VAA GTGCTAGAAG TTGTAAAAGA 1740 

ATGACTGCTA CATGACCTGC TGCCTAOGGA GAACTACATC TGTAATGGTT TTAAT6TTTT 1800 

GCTAAGTCAT GTGTTGTTCA TATCCCAAAA ACTTTTATAG GTAACTGTTT TCAAATAGAA 1860 

AAOGTTTTAT TTGGTCAATT TGAATGTCAT TCTAATTATA AAAATGACTT ACACCTTTAT 1920 

CAA7T6GTTA CTATTTCAAT GCACCCTTTA AAATTTGCTA TGCAAATGAG TATATGCT7G 1980 

TACTT6ACTT TAATATTTGT GCTAAAGTGA GCAAAGCTAC CTGTATAAAG AAAACACAGT 2040 

GGGTTGTGAC AAGGATGACA TGAAAATACA GGACAATTCT GACAATGTAG GG6CTGATTT 2100 

TATAGTGTAA GAACTATTAA TQCCCCTTOC TTCTTTTTTC TGCCTCTTGC TCTTGTCTTT 2160 

T6GACATTTC A6TGATTGTA AGTTCTTOGa TCATGTCAGC CCCTGTCATC AACTTGAGTT 2220 

ACAQTAGATO GGGCAGACAT GGAGTQTTTG CTATATAAAA CTATCTGTTT GTTTTACTTC 2280 

CTTGTGCGCT TTTTGTTCTC 'iV n VIV rm TTAATGAAGC TTTTCCTGCC CATTATTAAT 2340 

OCAAACTCTT GGACCTTGTG GTTAGGAAAT TCCCTtAACT TCCAGCCATA TGGCATTATC 2400 

GTGTCTCTTT CTCTCTCTCT CTTGCTCTCT CTCTTCTCCT CTTCCCCATA TTTTCTGTCA 2460 

AATAAGTACT GTTTACTCAT TTAGTTGCTT ATCAAGTACT TATTCTTGGT TTTAAAAAAA 2520 

ATTAATGGTA ACrGTATTTT TCTCATTTTT AGCATTATTC AAATCTTTAT ATTTTAATAC 2580 

CTTTAAACCA CTTTAAAGTT rrri'CATGTT TAATTATAST TTTAAGAAAA ACTATTTTGA 2640 

ACAACOCCAA ATATAGTGCA TCTASAAACT AATGTATATT TGATTAGACA TCATTTATAG 2700 

TGGAACASTA GACTGTAGTA CATGGTAATT TTTCTTTTAC TATTAAGATA CAATAAAACA 2760 

TGACTAATTT TXSCTGTCAAA AATGTAAAQA ATAATGATAA ATX3GAGTTTT TTATATTTTA 2820 

CTTTTAAGAT TGCCTGTCTT TAATAAGACA AAGCCTTAAG CCTTATGTTA TAATTTTGGT 2880 

TCTAAAAACC ATCATTTCAG TATAAQGAAT AAGTATATTT OGTCCTCCTC TTTAGTTTTT 2940 

TTCTTCCTAT TTATTTTTAT TTTGAAAAAT TTCTACACCT TCTTTGAATT OCTTC3TAT6A 3000 

ATTTTTGTTT CTTAGAAGTT AA T TTOTGTG AAATGAOATT CTTCAAAA08 ATGAAAOCTC 3060 

ATAGCTCTGA GAAAAGGTTT TAGGGnTTA AATTCTAAQC AAAQOSTGAC TATGOCTGAC 3120 

AGACTACACA TTTAATTATA CAGCTTCTCT TTCTTAACCA CAGGCAGATT AACCTCATTG 3180 

TGGATTGTCC TTCAGACCTT AGTOCTCAGG CATC6TTTCT GGTGCCCACr CCTGGAAGOC 3240 

0CT6TTCCCT TTCTACCTTC TTACCAGA6C GCAAGGGGAQ GGCTGGTCCX GGGGAAGCAG 3300 

CAGCTT6CTG ACATAA6TCA GCTGCAAAOG CTGAGGAGTG T6CCCTCASA GAA8CAC06C 3360 

CCCCCAGTCT TGTGCCAGOG CCTAGAGCOG CAfiCTCCCAa QQATGCTCCT TCCCTGGAGQ 3420 

CAGCCCAGGA GAGGGACTCT OGCAGOGTTC TTCAGATTTG TGGCCACTGT TTCTCATTTG 3480 

CTGCTTGACT OTTTTrATTT CTTAGGCTTT TGCTAiGTTTT AGAAAATAGG GAAGCAGCCC 3540 

TTOATTTOTG GA7TAAAAGC AACATTT6A0 GGATGATGCA CAACAGTCCA G8AAAATGG0 3600 

CGGTGGACAC TTGAGGCTGA GQA7GG6AGT TGACAT6AGC AOGGAGAGGG AGQTGCG08C 3660 

TGCTTATCTG TGATTGTT6C TCACCTGAGT GTGGCTGATT GTGTACATCC AOCAGTTACA 3720 

ATTTTTAAAA ATTATACTTT TACATTTATT TTATATTTTT CTCACCCCCA GTAATTTCCT 3780 

TCCAAAGAAG TTCACATQTA AZAAGTAGAA ATTCTGTATA 66AAAAAAGC ATTAAAAATA 3840 

CTATTATAAC TGCTTCATTT G CTOGG AAOC ATTAA AAOTA ATATAAATTA GCTTTTTCCA 3900 

GAAGGATOCr TTTGTAGCA6 TGTTTATC3AA TG7AA00C0C A6CAAAAXAT GGCTATATAT 3960 

TAGGGGAGCC AGTTT6GAGC AGAGGCCTGA AGGTCCCTGC TATGCAGCOG TGOCCACAGC 4020 

TCGCAGCCCA AGCACTGTGG AGCATCCACA CCTTTGATGO CAATGCAGAT TGGTAGCA6G 4080 

TTCCATAGGC GTACAAAACA GTATTAAAGC TCAQTGTTTT OCATATTGTT AGCATTTACA 4140 

AATATTTTTO C TTTAQ TATq AQGAA AOTAA OC AIGQG CAA AGAAGOGATC AAAATA6CTA 4200 

TTGCTACAAC ATTTT06AAA ACAAAGTTQ6 GGCTGTATTT CTTTAAAAA6 ATAAGCCTCT 4260 

AAAAATGCTT QGCAAAAAAA ATATAGTGTT AAAATAGGOC AGTGATATTA ATGAGAAAAT 4320 

GAAAGTATGT ATCAGGAATA AAGTGATATT GCATAGGAGT ArrGTATTTT TATGAATTTT 4380 
AT60CA6TT6 TTTACATGTA CTATATATGT TAAATTAAAA AAAATCATGA GAAATG 

Seq ID NO: 499 Protein sequence 
Protein Accession #: BAA74900.1 

1 11 21 31 41 51 

| | | | | | 

PLVZIITLKRF NLYPEVIIAS WyRIYTKIMD LXGIQTiaCW TVTRGBGZiSP XBSCBGLGDP 60 

ACFYVAVIFI LNOLMMALPP lYGTYLSGSR LGGLVTVLCP FPNHGECTRV KWTPPLRESP 120 

SYPFLVLQML LVTHILRATK LYRGSLIALC ISKVFPKLPW QFAQFVLLTQ lASLFAVYW 180 

GYIDZCKUUC IZyiHMZSLA LCFVLMFQIS MLLTSYYASS LVZZH6ILAN KPHFZiKJMVS 240 

BLSLHVZQ6C PlfLFGTVILK YLTSRIPGIA SDAHKBRiLT SKFFSYKDFO TLLYTCAASP 300 

DFHBKETPLR YTRTliLLPW LWPVAZVRK ZISDKW6VLA XQQTBVRKHQ FDHGSLWBA 360 

LQUiAYTALO IliIHRLKIiFL TPHKCVMASL XCSRQLFGWL FCKVHPGAIV FAIZiAAMSIQ 420 

GSAmiQTQWN IVGEPSNLPQ EELXEHItOrS TKPDAVFAGA KPIMASVKLS ALRPZVNHPH 480 

YEDAGLRART KXVYSMYSRK AAEEVKRELZ KLKVNyYILB ESHCVRRSKP GCSMPEINDV 540 
BDPANAGKTP LCMLLVKDSK PHFTTVFQNS VncVLEWRB 

Seq ID NO: 500 DNA sequence 

Nucleic Acid Accession #: NM_001276.1 

Coding sequence: 127..1278 

1 11 21 31 41 51 

| | | | | | 

AGTGGAGT6G GACAGGTATA TAAAGGAAGT ACAGGGCCTG GGGAAGAG6C CCTGTCTAGG 60 

TAGCTGGCAC CA0GAGCG6T GG6CAA6G6A AGAG60CACA OOCTOCC C T G CTCTGCT6CA 120 

GCCAGAAIGG GZGTGAA06C GTCTCAAACA G6CTTTGTGG TCCTGGrTGCT GCTCQUSTGC 180 

T6CTCXGCAT ACAAACT6GT CTGCTACTAC ACCAGCTCGT CCCA6TAC0G GGAAGG06AT 240 

GGGAGCTGCT TCCCAGATGC CCTTQACOGC TTCCTCTGTA CCCACATCAT CTACAGCTTT 300 

GCCAATATAA GCAACC5ATCA CATCQACACC TGGGAGTGCa ATGATGTGAC GCTCTAOGGC 360 

ATGCTCAACA CACTCAA6AA CAG6AACCCC AACCTGAAGA CTCTCTTGTC TGTCGGAGGA 420 

TGGAACTTTG GGTC7CAAAG ATTTTCCAAG ATAGOCTOCA ACACCCAGAO T06C0G6ACT 480 

TTCATCAAGT CAGTACCGOC ATT0CTG06C ACCGATGGCT TT6AT6GGCT G6ACCTTGCC 540 

TGGCTCTACC CTGGACGGAG AGACAAACAG CATTTTACCA CCCTAATCAA GGAAATGAAG 600 

GCCGAATTTA TAAAGGAAGC CCAGCCAGGG AAAAAGCAGC TCCTGCTCAG OGCAGCACTG 660 

TCTGOGGGGA AG6TC31CCAT IX3ACA6CAGC TA7GACAZT6 CCAAGATATC CCAACACCrS 720 

GATTTCATTA GCATCAT6AC CTAOGATTTT CA3GGAG0CT GG05IG6GAC CACAOGCCAT 780 

CACAGTCCCC TGTTC06A66 TCAG6AGGAT 6CAAGT0CTG ACA6ATTCAC CAACACT6AC 840 

TATGCTGTGG GGTACATGTT GAGGCTGGGG GCTCCTGCCA GTAAGCTGGT GATGGGCATC 900 

CCCACCTTOG GGAGGAGCTT CACTCTGGCT TCTTCTGAGA CTGGTGTTGO AGCCCCAATC 960 

TCAGGACCGG GAATTCCAGG CCGGTTCACC AAGQAGGCAG GGACCCTTGC CTACTATGAG 1020 

ATCIOTQACT TCCTCC6CG0 AGCCACAOTC CATAGAAOCC TCGGCCA6CA GGTCCCCTAT 1080 

6CCACCAAGG GCAACCAGTG GGTA6GATAC GA06ACCAGG AAAGOGTCAA AAGCAAGGTG 1140 

CAGTACCTGA AGGATAGGCA GCTGGCAGGC GCCATGGTAT GGGCCCTGGA OCTGGATGAC 1200 

TTCCAGGGCT CCTTCTGCGG CCAGGATCTG CGCTTCCCTC TCACCAATGC CATCAAGGAT 1260 

GCACTCGCTG CAAC6TAGCC CTCTGTTCTO CACACAGCAC 6GQGGCCAAG GATGCCCOGT 1320 

GCCCCTCTGG CTCCAGCTGG COGGGAOCCT GATCACCTOC CCTGCTGAGT CCCAGGCTGA 1380 

GCCTCA6TCT CCCTCCCTTG G6GCCTATGC AGACGTCCAC AACACACAGA TTTGAGCTCA 1440 

GCOCTGGTGG GCAGAGAGGT AGGGATCGGG CTGW3GGGAT AGTGAGGCAT OGCAATGTAA 1500 

GACTCGGGAT TAGTACACAC TTGTTGATGA TTAATGGAAA TGTTTACAGA TCCCCAAGCC 1560 

TGGCAAGGGA ATTTCTTCAA CTCCCTGCCC CCTAGCCCTC CTTATCAAAG GACACCATTT 1620 

T6GCAAGCTC TATCACCAAG GAGCCAAACA TCCTACAAGA CACAGTGACC ATACTAATTA 1680 

TAOXCCTGC AAAGCCAGCT TGAAACCTTC ACTTAGGAAC GTAATCGTGT CCCCTATCCT 1740 

ACTTCCCCTT CCTAATTCXav CAGCTGCTCA ATAAAGTACA AGAGTTTAAC AGTGTGTTGG 1800 

OGCTTTGCTT TGGTCTATCT TTGAGCGCCC ACTAGACCCA CTGGACTCAC CTCCCCCATC 1860 

TCTTCTGGGT TCCTTCCTCT GAGCCTTGGG ACCCCTGAGC TT6CA6AGAT GAA66C06CC 1920 
ATGTT 

Seq ID NO: 501 Protein sequence 
Protein Accession #: NP_001267.1 

1 11 21 31 41 51 

| | | | | | 

H3VKASQTGF WLVLLQCCS AYKLVCmS WSQYRBGDG8 CFPDALDRFL CIHIIYSFAN 60 

HHZ^VTLYGML NTUCMRNPNL KTLLSVGGHN FGSQRFSKIA fiNTQSRRTFI 120 

KSVPPFLRTH GFDGLDLAWL YPGRRDKQHF TTLIK94KAE FIKEAQPGKK QLLLSAAIiSA 180 

GKVTIDSSYD lAKISQHLDP ISIMTYDFHG AWRGTTQHHS PLFRGQEDA8 PDRFSNTDYA 240 

VGYMLRLGAP ASKLVMGIPT FGRSFTLASS STGVGAPISG PGIPGRFTKE AGTLAYYGIC 300 

DFLRGATVKR TLGQQVPYAT KGNOWVOYDD QBSVXSKVQY XiKXSRQIAGAM VHALDLOOFQ 360 
GSFCXS^IiRF PL7HAZKDAL AAT 

Seq ID NO: 502 DNA sequence 

Nucleic Acid Accession #: NM_006474. 

Coding sequence: 181..669 

1 11 21 31 41 51 

| | | | | | 

GCTGCCTAGG GTCTGGAAAO CTCX3GGCACC CTCCCTCTCC TGGCCGOGGT CTCCCACCCC 60 

TC06GCCCCC OCACOGTCOC QCTCCTOCAa GCTG6 G0 CT0 GCAACAACTC GCTTTTAATT 120 

nCCCCCAGC TCAGAATCTT GCTGCT066C CCCCAGGAGA OGTOGCTCTG AACGG6AAC6 180 

ATGTGGAAGG TGTCAGCTCT GCTCTTOGTT TTGGGAAGOG GGTCCTGGCA 240 

GAAGGAGCCA GCACAGGCCA GCCAGAAGAT GACACTGAGA CTACAGGTTT 6GAAGGCX30C 300 

GTTGCCATGC GA6GT6CCX3A AGATGA7GTG GTGACTGCAG GAACCAG06A AGACOGCTAT 360 

AA6TCTG6CT TGACAACTCT GGTGGCAACA AGTGTCAACA GrraTAACAGS CATT06CATC 420 

GAGGATCTOC CAACTTCAGA AAGCACAGTC CA0606CAA0 AACAAAGTCC AAGOGCCACA 480 

GCCTCAAACG TGGCXaVCCAG TCACTCCAOG GAGAAAGTGG ATGGAGACAC ACAGACAACA 540 

GTTGAGAAAG ATGGTTTGTC AACAGTGACC CTGGTTGGAA TCATAGTTGG GGTCTTACTA 600 

GCCATCG6TT TCATTGGTG6 AATCATOGTT GTG6TTATGC GAAAAATQTC GGGAAGGTAC 660 

T06CCCXAAA GAGCTGAAGG GTTA06CCCT GCTTGCCAAC GT6CTTTAAA AAAAGAOOGT 720 

TTCTGACTCT GZG6CCCTGT CCCTGAGCTC GTGGGGAOAA GAT6ACCCT0 GGAACATTT6 780 

GGG6CCCATT CAGATTCCAC GGTGACTTTC c cmm ccAA ATTAAC06A0 GAAA6A0CTT 840 
TCACCA6ATT TGGTTCTTAA ACTTT 

Seq ID NO: 503 Protein sequence 
Protein Accession #: NP_006465.1 

1 11 21 31 41 51 

| | | | | | 

MHKVSAIiLFV LGSASLWVIA BGASTGQPED DTETTGLBGG VAKP6ABDDV VTPGTSSDRY 60 

KSGLTTLVAT SVNSVT6ZRI EDLPTSESTV HAQEQSPSAT ASMVATSEST EKVDGDTQTT 120 
VEKDGLSTVT LVGIIVGVLL AIGPIGOIIV WMRKMSQRY SP 

Seq ID NO: 504 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 62..895 

1 11 21 31 41 51 

| | | | | | 

CACIGCTCTG AGAATTTGTQ AGCAOCCOCT AACAGQCTST TACTTCACTA CAACTGAOGA 60 

TATGATCATC TTAATTTACT TATTTCTCTT GCTATOQGAA GACACTCAAG GATGGGGATT 120 

CAAGGATGGA ATTTTTCATA ACTCCATATO OCTTGAAOQA GCAGCOGGTG TGTACCACAG 180 

AGAAGCACGG TCTGGCAAAT ACAAGCTCAC CTACGCAGAA GCTAAGGOGG TGTGTGAATT 240 

T6AAGG0G6C CATCTOGCAA CTTACAAGCA GCTAGAGGCA GCCAGAAAAA TTGGATTTCA 300 

T G TCTGT G Cr GCTGGftTGGA TGGCTAAGGG GAGAGTTQGJ^ TAaCCCMTQ T6AA6GCAGG 360 

GCCCAACTGT G6ATTT6GAA AAACTGGCAT TATTGATTAT GGAATCOSTC TCAATAGGAO 420 

TGAAAGATCG QATGCCTATT GCTACAACCC ACACX3CAAAG GAGTGTGGTG GOGTCriTAC 480 

AGATCCAAAG CAAATTTTTA AATCTCCAGG CTTCCCAAAT GAGTAOGAAQ ATAACCAAAT 540 

CTGCTACroa CACATTAGAC TCAAGTATGG TCAGCGTATT CACCTGAGTT TTTTAGATTT 600 

TGACCTTGAA 6AIGA0CCAG GTSGCTT&X TGATTATGTT GAAATATATG ACAGTTAOGA 660 

TGAT6T0CAT GGCTTTGTG6 GAAGATACT6 TGGAGAT6A8 CTTOCAGATQ ACATCATCA6 720 

TACAGGAAAT GTCATGACCT TGAAGTTTCT AAGTGATGCT TCAGTGACAG CTGGAGGTTT 780 

CCAAAICAAA TATGTTGCAA TGGATCCTGT ATCCAAATCC AGTGAAG6AA AAAATACAAG 840 

tACTACrrCT ACT6GAAATA AAAACTTTTT AGCTG6AAGA TTTAGCCACT TATAAAAAAA 900 

AAAAAAAGGA TGATCAAAAC ACACAGTSTT TATGTTGGAA TCTTTTG6AA CTOCTTTGAT 960 

CTCACTGTTA TTATTAACAT TTATTTATTA TTTTTCTAAA TGTGAAAGCA ATACATAATT 1020 

TAGGGAAAAT TGGAAAATAT AGGAAACTTT AAAOSAGAAA AT6AAACCTC TCATAATOCC 1080 

ACTGCATAGA AATAACAAGC GTTAACATTT TCATATTTTT TTCTTTCAGT CATTTTTCTA 1140 

TTTOTGGTAT ATGTATATAT GTACCTATAT GTATTTGCAT TTGAAATTTT GGAATOCTGC 1200 

TCTATGTACA GTTTTGTATT ATACTTTTTA AATCTTGAAC TTTATAAACA TTTTCTGAAA 1260 

TCATTGATTA TTCTACAAAA ACATGATTTT AAACA6CTGT AAAATATTCT ATGA7ATGAA 1320 

TGTTTTATXIiC ATTATTTAAG CCTGTCTCTA TT6TTGGAAT TTCAGGTCAT TTTCATAAAT 1380 
ATTGTTQCRA TAAATATCCT 7QAACACACA AAAAAAAAAA AA 

Seq ID NO: 505 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

NIILIYLFLL LWEDTQGWGF KDGIFEMSIW LERAASVYHR EARSGKVXLT YAEAKAVCEF 60 

BGGHLATYKQ LEAARKIGFH VCAAGNKAKG RVGYPIVKPG PNCGFGKTGI IDYGHUUIRS 120 

ERHDAYCYNP HAKE0G6VFT DPKQZFK8PG FFNEYEDNQZ CyHHIRLKYG QRIHLSFU)? 180 

DLBDDPGCLA DYVBIYDSYD DVR6FVGRYC GDEliFODIZS TGNVMTLKFL SDASVTA06F 240 
QIKYVAMDPV SKSSQGKNTS TTSTGNKNFTj AGRPSHL 

Seq ID NO: 506 DNA sequence 

Nucleic Acid Accession #: NM_007115.1 

Coding sequence: 69..902 

1 11 21 31 41 51 

| | | | | | 

GAATTOGCAC TGCTCTGACSA ATTTGT6AGC AGCCCCTAAC AGGCT6TTAC TTCACTACAA 60 

CTGAC6ATAT GATCATCTTA ATTTACTTAT TTCTCTTGCT ATGGGAAGAC ACTCAA6GAT 120 

GGGGATTCAA G6ATGGAATT TTTCATAACT CCATATG6CT TGAACGAGCA GC0G6TGTGT 180 

ACCACAGAGA AGCACGGTCT GGCAAATACA AGCTCACCTA CX^CAQAAGCT AAGGOGGTGT 240 

GTGAATTTGA AGGOGGCCAT CTOOCAACTT ACAA6CAGCT AQAG6CAQCC AGAAAAATT6 300 

GATTTCATSr CTOTQCTOCT OGATGOATGG CTAA668CAG AlSTTGGATAC CCCATTQTGA 360 

AGCCAGG60C GAACTGATGA TTTG6AAAAA CTGGCATTAT T6ATTATGGA ATCOSTCTCA 420 

ATAGGAGTGA AAGATGGGAT GCCTATTGCT ACAAOCCACA CGCAAAGGAG TGTGGTGG06 480 

TCTTTACAGA TCCAAAGC6A ArmTAAAT CTCCAGGCTT CCCAAAT6AG TACGAASATA 540 

ACCAAATCT6 CTACTGGCAC ATTAGACTCA AGXATGGTCA GOSTATTCAC CTGAGTTTTT 600 

XAGATTTT6A OCTTQAAGAT GACCCAQGTT OCTTQGCTGA TTATGTTGAA ATATATQACA 660 

GTTA0QAT6A TGTCCAT6GC TTTGTGGGAA GATACTGTGG A6AIGAGCTT CXAGATGACA 720 

TCATCAGTAC AGGAAATGTC AT6ACCTTGA AGTTTCTAAG TGATGCTTCA GTGACAGCTG 780 

GAGGTTTCCA AATCAAATAT GTTGCAATGG ATCCTGTATC CAAATCCAGT CAAGGAAAAA 840 

ATACAAGTAC TACTTCTACT GGAAATAAAA ACTTTTTAGC TGGAAGATTT AGCCACTTAT 900 

AAAAAAAAAA AAQ6ATGATC AAAACACACA aTGTTTATGT TOGAATCTTT TGGAACTOCT 960 

TTGATCTCAC TOTTATTATT AACATTTATT TATTATTTTT CTAAATGTGA AAGAAA7ACA 1020 

TAATTTAGGG AAAATTGGAA AATATAGGAA ACTTTAAAOG AGAAAATGAA AOCTCTCATA 1080 

ATCCCACTGC ATAGAAATAA CAAGCGTTAA CATTTTCATA TTTTTTTCTT TCAGTCATTT 1140 

TTGTATTTGT GGTATATGTA TATATGTACC TATATGTATT TGCATTTGAA ATTTTGGAAT 1200 

OCTGCTCTAT GTACAGTTTT 6TATTATACT TTTTAAATCT TGAACTTTAT 6AACATTTTC 1260 

TGAAATCATT GATTATTCTA CAAAAACATG ATTTTAAACA QCTGTAAAAT ATTGTATOAT 1320 

ATGAATGTTT TATGCATTAT TTAAGOCTGT CTCXATTGTT OGAATZTCAG GTCATTTTCA 1380 
TAAATATTGT T6CAATAAAT ATCCTTCX3GA ATTC 

Seq ID NO: 507 Protein sequence 
Protein Accession #: NP_009046.1 

1 11 21 31 41 51 

| | | | | | 

NZILIVLFLL LHBDXQGWGF KDOZFHNSZW I£RAAGVYBR BARSGRYKLT YASAXAVCBP 60 

EGQHLATYRQ Z£AMUCZGPH VCAAGNMAK6 RVGYPIVKPG SMXXFGRTGZ IDYGZRLNRS 120 

ERWDAYCYNP HAXBOGOVFT DPKRIPR5PG FPNEYECNQZ CYHaZRUCYO QRZKLSFLOP 180 

DLEDDPGOiA DYVEIYDSYD DVBGFVCSiyC GDELPDDZZ8 TGNVMTLKFL SDASVTAGGP 240 
QIKYVAMDPV SKSSQGKI7TS TTSTGNKKFL AGRPSHL 

Seq ID NO: 508 DNA sequence 

Nucleic Acid Accession #: NM_001044.1 

Coding sequence: 129..1991 

1 11 21 31 41 51 

| | | | | | 

ACC6CTC0GG AGOGGGAGGG GAGGCTTOSC GGAACSCTCT OGGCGCCAGG ACTCGCGTGC 60 

AAA6CCCAGG OC00GGC6GC CAGACCAAGA GGGAAGAAGC ACAGAATTCC TCAACTCCCA 120 

GTGTGCCCAT GAGTAAGA6C AAATGCTCCG TGGGACTCAT GTCTTCOGTG GTGGCCC06G 180 

CTAAGGAGCC CAATGCOGTG G6CCC6AAGG AGGTGGAGCT CATCCTT6TC AAGGA6CAGA 240 

AOSSAGTGCA GCTCACCAGC TCCACCCTCA CCAACC0806 GCA0A6CCOC GTGGAG6CCC 300 

AGGATOGGGA GACCIGGGQC AAGAAGATCS ACTTTCrOCT QTCOGZO^TT GGCTXTGCIG 360 

TGGACCTGGC CAAGGTCIGO CGQTTCCCCT ACCTGIGCTA CAAAAATGGT GGC6GTGCCT 420 

TCCPS G T C CC CTACCT6CTC TTCATSGTCA TTGCXGGSAT GOCACrmC lAOlIGGACC 480 

TGGCCCT0G6 CCAGTTCAAC AGGGAAGG6G OO GC TGGTGT CTGGAAGAIC TGCXTCATAC 540 

TGAAAGGTGT GGGCTTCACG GTCATCCTCA TCTCACTGTA TGTOGGCTTC TTCTACAACG 600 

TCATCATCGC CTGGGCGCTG CACTATCTCT TCTCCTCCTT CACCAOGGAG CTCCCCTGGA 660 

TCCACTGCAA CAACTCCTGG AACAGOCCCA ACTGCTOGGA TGCCCATCCT GGTGACTCCA 720 

GTG6AGACAG CTOC3QGCCTC AAOGACACTT TIGGGAOCAC AOCTGCTGOC 6AGTACTTT6 780 

AACGTG G CXST 6CT6CACCTC CACCAGAGGC ATG6CATGGA CGACCTGGGG CCTCOGCGGT 840 

GGCAGCTCAC AGCCTOCCTG GTGCTGGTCA TC G TGCTGCT CTACTTCAGC CTCTGGAAGG 900 

GCX3TGAAGAC CTCAGGGAAG GTGGTATGGA TCACAGCCAC CATCCO^TAC GTGGTCCTCA 960 

CTGCCCTGCT CCTGOGTGGG GTCACCCTGC CTGGAGCCAT AGAOGGCATG AQAGCATACC 1020 

TGAG08T7GA CTTCTACGGG CTCIGC3GAGG CtflVlV rm GATTGA0600 60CACGCAGG 1080 

TGTGCTTCTC GCTG66GGT6 G GG TTOQGGG TGCT6AT06C CTTCTCCAGC TACAACAAGT 1140 

TCACCAACAA CTGCTACAGG GACGCGATTG TCACCACCTC CATCAACTCC CTGACXSAGCT 1200 

TCTCCTCCGG CTTCGTCGTC TTCTCCTTCC TGGGGTACAT GGCACAGAAG CACAGTGTGC 1260 

CCATOGGGGA 0GT6GCCAA6 GACX3GGCCAG GGCTGATCrT CATCA7CTAC C0GGAA6CXA 1320 

T06CCACGCT CCCTCTGTGC TCAGCCTGGG OJ U 'lWlt m ' CTTGATCATG CIQCTCACCC 1380 

TGGGTATOGA CAGCX3CCAT6 GGTGGTTITGG AGTCAGTGAT CACGGG6CTC ATOSATGAGT 1440 

TCCAGCTGCT GCACAGACAC OGTGAGCTCT TCAOGCTCTT CATOGTCCTG GCX3ACCTTCC 1500 

TCCTQTCCCT GTrCTGCXTTC ACCAACGGTG GCATCTACGT CTTCAOGCTC CTGGACCATT 1560 

TTGCAGCCG6 CAC6TCCATC CTCTTTGGAG TGCTCATGGA AGCCATOSGA GTGGCCTGGT 1620 

TCTATGG1GT T0G6CAGTTC AGCX3A0GACA TOCAGCAGAT GACOGGGCAO OOGCCCAiSCC 1680 

T6TACTGG06 GCTGTGCTGG AAGCTQCTCA GCCCCTGCTT TCTCCTGTTC GTOGTOGTCG 1740 

TCAGCATTGT GACCTTCAGA CCCCCXXACT ACGGAGCCTA CATCTTCOOC GACTGGGCC3V 1800 

ACGCGCTCGQ CTGGGTCATC GCCACATCCT CCATGGCCAT GGTGCCCATC TATGCGGCCT 1860 

ACAAGTTCTO CAGCCTGCCT GGGTCCTTTC GAGAGAAACT GGCCTAOGOC ATTGCACCX33 1920 

AGAAG6ACCX3 tGAGCTGGTG GACAGAGGGG AGGTGOGCCA 6TTCA06CTC 06CCACTG6C 1980 

TCAAGGTGTA GAGGGAGCAG A6AGGAAGAC CCXAGGAAGT CATCCT6CAA TGGGAGAGAC 2040 

ACGAACAAAC CAAGGAAATC TAAGTTTOSA GAGAAA6GA6 GGCAACTTCT ACTCTTCAAC 2100 

CTCTACTGAA AACACAAACA ACAAAGCAQA AGACTCCTCT CTTCTGACPG TTTACACXTTT 2160 

TCCOTOCOGG GAOCGCACCT CGCCGTGTCT TGTGTTGCTG TAATAAOGAC GTAGATCTGT 2220 

GCAGCX3AGGT CCACCCCGTT GTTGTCCCTG CAGGGCAGAA AAA06TCTAA CTTCATGCTG 2280 

TCTGTGTGAG GCTCCCTCOC TCCCTGCTCC CTGCTCCOGG CTCTGAGGCT GCCCCAGGGG 2340 

CACTGTGTTC TCAGGCGGGG ATCACX5ATCC TTGTAGACGC ACCTGCTCTG AATCCCCGTG 2400 

CTCACA6TA6 CTTCCTAGAC CATTTACTTT GCCCATATTA AAAA6CCAAG TCTCCTGCTT 2460 

GGTTTAGCTG TGCAGAAQGT GAAATGGAOG AAACCACAAA TTCATGCAAA GTCCTTTCCC 2520 

GATGCGTGGC TCCCAGCAGA GGCCGTAAAT TGAGOGTTCA GTTGACACAT TGCACACACA 2580 

GTCTGTTCA6 AGGCATTGGA GGATGGGGGT CCTGGTAT6T CTCACCAGGA AATTCTGTTT 2640 

AT6TTCTTGC AGCA6AGAGA AATAAAACTC CTTQAAACCA GCTCAQGCTA CTGCCACTCA 2700 

GGCAGCCT6T G6GTCCTTGT G6TGTAGG6A AOGGOCTSAO AGGASOgT GT CCTATCCCCG 2760 

GACGCATGCA GGGCCCXXrAC AGGAGCGTGT CCTATCCCOG 6A0GCATGCA GGGCCCCCAC 2820 

AGGAGCATGT CCTATCCCTG GACGCATGCA GGGCCCCCAC AGGAGCGTGT ACTACCCCAG 2880 

AACGCATGCA GGGCCCCCAC AGGAGCGTGT ACTACCCCAG GACGCATGCA GGGCCCCCAC 2940 

TGGAGC8TQT ACTACCCCAG 6A0SCATGCA GGGCCCCC A C AOGAGCBTOT CCTATCCCCG 3000 

6AC0GGACGC ATGCAGG6CC CCCACAGGAG CGTGTACTAC CCCAOQACGC AT6CAGG6CC 3060 

CCCACAGGAG CGTGTACTAC CCCAGGATGC ATGCAGGGCC CCCACAGGAG CGTGTACTAC 3120 

CCCAGGAOGC ATGCAGGGCC CCCATGCAGO CAGCCTGCAG ACCAACACTC TGCCTGGCCT 3180 

TGAGCCGTGA CCTCCAGGAA GGGACCCCAC TGGAATTTTA TTTCT C TCAG GTGCGT6CCA 3240 

CATCAATAAC AACAGTTTTT ATGTTTGOGA AT6GCTTTTT AAAATCATAT TTACCTGTGA 3300 

ATCAAAACAA ATTCAAOAAT GCAGTATCG6 CQAGCCTGCT T8CT8ATATT QCAGTTTTTG 3360 

TTTACAAGAA TAATTAGCAA TACTGAGTGA AGGATGTTGG CCAAAAGCTG CTTTCCATGG 3420 

CACACTQCCC TCTGCCACTG ACAGGAAAGT GGATGCCATA GTTTGAATTC ATGCCTCAAG 3480 

TCGGTGGGCC TGCCTACGTG CTGCC06AGG GCAG6G6C0G TGC AGGG CCA 6TCATGGCTG 3540 

TCCCCTGCAA GTGGAC6TG6 GCTCCAOGGA CTGGAGTGTA ATGCT08GT0 GQA6C0STCA 3600 

GCCT6TGAAC TGCCA66CAG CTGCAGTTAG CACAGAGQAT GQCTTOOOCA TTOCCTTCTG 3660 

GGGAGGQACA CAGAGGAOGG CTTCCOCATC OCCTTCTGGC OGCTGCASTC AGCACAGAGA 3720 

GCGGCTTCCC CATTGCCTTC TGGGGAGQQA CACAGAGGAC AGTTTCCOCA TOGCCTTCTG 3780 

GTTGTTGAAO ACAGCACAGA GAGCQOCTTC CCOITCGCCT TCTGGGGAGG GGCTCCGTGT 3840 

AOCAAOCCAG GTGTTGTCCG TGTCTGTTGA CCAATCTCTA TTCAGCATC6 TGTG6GTCCC 3900 
TAAGCACAAT AAAA6ACATC CACAATGGAA AAAAAAAAAG GAATTC 

Seq ID NO: 509 Protein sequence
Protein Accession #: NP_001035.1

1 11 21 31 41 51

| | | | | |

MSRSKCSVGL MSSWAPAKE PKAV6FKEVG LILVKBQNGV QLTSSTLTNP RQSFVEAQDR 60 

ETW6KRZDFL LSVIGFAVDI4 ANVHRF?YZiC YRNGGQAFLV PYLLFMVZAG KPLFYMELAL 120 

GQF19KBGAAG VNXICPZIiKO VGFTVZZJSL yVGFTVNVXZ AKALHYLFSS FTTELPWIKC 180 

HNSNNSPIICS DARFGDSSGD SSGIMDTFGT TPAABYFERG VLBLBQSHGX DDLGPPRWQL 240 

TACLVLVIVL LYFSLWKGVK TSGKWHITA TMPyWLTAL LLRGVTLPGA IDGIRAYLSV 300 

DFYRLCEASV HIDAATQVCP SbGVGPGVLI AFSSYNKFTO NCYRDAIVTT SINSLTSPSS 360 

GFWFSPLGY KAQKH5VPZG Z7VAKIX3PGLI FIZYPSAIAT I/PZiSSAKAW FFIMLLTLGZ 420 

DSAMGGHESV ZTQX.IDEFQL LHSHRBLFTL PIVLATFU.S LPCVrNGGIY VFTLLDHFAA 480 

GTSILFOVLI EAIGVAWFYQ VGQFSDDIQQ MIGORPSLyw RLCHXLVSPC FLLFWWSI 540 

VrPRPPBYOA YIPFDWAKAX. GWVZATSStlA MVPZYAAYKF CSLP6SFSEK lAYAZAPEKD 600 
RELVDRCTVR QFTLREtOXV 

Seq ID NO: 510 DNA sequence

Nucleic Acid Accession #: NM_001216.1

Coding sequence: 43..1422

1 11 21 31 41 51

| | | | | |

6CCCGTACAC ACCGTGT6CT 6GGACACCCC ACA6TCAGCC GCAT6GCTCC CCTGTGCCCC 60 

AGCCCCTGGC TCCCTCTGTT GATCCOGGCC CCTGCTCCAO GCCTCACTGT GCAACTGCTG 120 

CTGTCACTGC TGCTTCTGAT GCCTGTCCAT CCCCAGAGGT TGCCCOGGAT GCAGGAGGAT 180 

TCCCCCTTGG GAGGAG6CTC TTCT6G66AA 6ATGACCCAC TGGG0GA6GA GGATCTGCOC 240 



AGTGAAGAOG ATTO10C3CAG AQA6GA6QAT CCAO00GC3AG AG6AGGATCT ACCTGGAGAG 300 

GAGGATCTAC CT6GAGAGGA GGAICTAOCT GAAGmAGC CTAAATCAGA AGAAGAGGGC 360 

TCCCTGAAGT TAGAGGATCT ACCTACTOTT GAGGCTOCIG GftGATCCTCA AGAACCCCAG 420 

AATAAIGCCC ACAGGGACAA AGAAGGGGAT GACCAGAGTC ATTGGCGCTA TGGAGGOGAC 480 

CC36CCCTGGC CCOGGGTOTC CCCA6CCTGC GOGGGC08CT TCCW3TCCCC GGT6GATATC 540 

OGOOOCCAGC TOGCOGOCTT CTGCOOGCOC cro08CCOC3C TOGAACTCCT GGGCITCCAG 600 

CTOCOGCCX;C TCCCAGAACT G060CTGG6C AACAATGGCC ACA6TGTGCA ACTGAOCCTG 660 

CCrCCTGGGC TAGAGATOGC TCTGGGTCCC GGGOGGGAGT ACCGG6CTCT GCAGCTGCAT 720 

CPGCACTGGG GGGCTGCAGG TCGTCCGGGC TOGGAGCACA CTGTG GAAO G CCACCG TTTC 780 

CCT800GAGA TCCACGTGGT TCACCTCAGC ACOGCCTTTG CCAGAGTTGA CX5AGGCCTTG 840 

QQGOGOOCGG GAGGCCTO G C CGTCTT6G0C GCCTTTGTGG AGOAGGGCGC OGAAGAAAAC 900 

AGTSCCTATG AGCAGTT6CT GTCTOQCTTG GAAGAAATOG CTGAGGAAGO CTCAGASACT 960 

CAGGTOCCAG GACTGGACAT ATCTGCACTC ClXjOXTCrG ACTTCAGCOG CTAC TTCCAA 1020 

TATGAGGGGT CTCTGACTAC ACOGOCCTGT GCCCAGGGTG TCATCTGGAC TGTGTTTAAC 1080 

CAGACAQTGA TGCTGAGT6C TAAGCAGCTC CACACCCTCT CTGACACCCT QTGGGGACCT 1140 

GGTGACTCTC OGCTACAGCT GAACTTCCGA GOGAOGCAOC CTTTGAATQO GCGAQTGATT 1200 

GAOGCCTCCT TCCCT G CPGG AGTGGACAGC AOTCCTOOGG CTGCTGAGCC AGTCCAGCTG 1260 

AATTCCTGCC TGGCTGCTGG TGACATCCTA GCCCTGGTTT TTGGCX:TCCT TTTTGCTGTC 1320 

ACXAGOGTOS CGTTCCTTGT GCAOATGAGA AGGCAGCACA GAAGGGGAAC CAAAGGGGGT 1380 

GTGAGCTACC GCCCAGCAGA GGTAGCCGAG ACTGGAGCXT AGAGGCTGGA TCTTGGAGAA 1440 

TGTGAGAAGC CAGCCAGAOG CATCTGAGGG GGAGCOGGTA ACTGTOCTOT CCTGCTCATT 1500 

atgccacttc cttttaactg ccaagaaatt ttttaaaata aatatttata at 

Seq ID NO: 511 Protein sequence 
Protein Accession #: NP_001207.1

1 11 21 31 41 51

| | | | | |

MAPLCPSPWL PLLIPAPAPG LTVQLLLSLIj LLMPVHPQRL PRMQEDSPLG GGSSGEDDPL 60 

GEEDLPSEBD SPREEDPPGE BDLPGEEDLP GEEDLPBVKP KSEEEGSLKL EDIiPTVEAPG 120 

DPQBPQNNAH RDKBGDDQSH WRYGGDPPWP RVSPACAGRF QSPVDIRPQli AAPCPALRPli ISO 

EL]j6F0Z<PPL PBLBLR13NGH SVQLTLPPGL E34ALGPGREY RALQ T iHTi HW G AAGRPGSEHT 240 

VB6ESFPABX KWHX*STAFA PVDEALGEPG GLAVLAAFLB BGPEENSAYB QLIiSRLEEIA 300 

EBGSBTQVPG LDXSALLPSD FSRYFQYBGS LTTPPCAQGV IWTVPNQTVM LSAKQIiHTLS 360 

DTLHGFGDSR LQUIFRATQP UJGRVIBASP PAGVDS5PRA AEPVQLHSCL AAGDILALVF 420 
6LLFAVTSVA FLVQMRRQHR R6TKG6VSYR PABVABItsA 

Seq ID NO: 512 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..3978

1 11 21 31 41 51

| | | | | |

AT66TGGGTG AAGGAOOCTA CCTTATCTCA GATCIG6ACC AGC6AGGC06 G0GGA6ATCC 60 

TTTGCAGAAA GATATGACCC CAGCCTGAAG ACCATGATCC CAGTGCGACC CTGTGCAAGG 120 

TTAGCACCCA ACCCGGTGGA TGATGCCX3GG CTACTCTCCT TCGCCACATT TTCCTGGCTC 180 

ACGC06GTGA TGGTGAAAGG CTAOOGGCAA AGOCTGACOS TAGA CACCCT GCCOCCATTG 240 

TOOACATATO ACTCATCTGA CAOCAATGCC AAAAGATTTC GAQTCCTTTG GGATGAAGA6 300 

GTA6CAA0GG TGGGTCCTGA GAAGGCCTCT CTGAGCCAOG TGGTGTGGAA ATTCCAGAGG 360 

ACAOGCGTGT TGATGGACAT CGTGGCCAAC ATCCTOTGCA TCATCATGGC AGCCATAGGG 420 

CCGACAGTTC TCATTCACCA AATCCTCCAG CAGACTGAGA GGACCTCTGG GAAAGTCTGG 480 

GTTG6CATTG GACTGTOCAT AGCOCTTTTT 6CCA00GAGT TTACGAAAGT CrrCTTTTGG 540 

GCOCTTGOCT QGGCCATCAA CTAOOSCAOQ GCCATCOGOT TQAAGQTGGC GCTCTCCACC 600 

rmm ' iU ' X ' m aaaacctagt gtocttcaag acattqaccc acatctctot tggogaggtg 66 o 

CTCAATATAC TCTCAAGTGA TAGCTATTCT TTGTTTGAAG CTGCCTTGTT TTGTCCTTTG 720 

CCACCCACCA TCCOGATCCT AATOGTCTTT TGTGCGGC3GT AOGCCTTTTT CATTCTGGGG 780 

CCCACA6CTC TCATOQOOAT ATCAGTGTAT GTCATATTCA TACCCGTCCA GATGTTTATG 840 

GOCAAGCTCA ATTCAOCTTT OOQAAGGTCA GCAATTT7GG TGACAGACAA GCGAGTTCAO 900 

ACaUlTGAATG AGTTTCTGAC CTGCATCAGG CTGATCAAAA TGTATOOCTO GGAGAAATCT 960 

TTTACCAACA CTATCCAAGA TATAAGAAGO AGOGAAAGAA AATTACTGGA AAAAGCTGGA 1020 

TTTCTCCAAA GTGGAAACTC TGCXXTTGCCC CCCATCX3T6T CCACX3^TAGC CATCGTGCTG 1080 

ACATTATCCT OCCACATCCT CCTGAGAC3GC AAACTCACCG CACCOGTGGC ATTTAGTGTG 1140 

ATTGCCATGT TTAATOTAAT (»AGTTTTOC ATTGCAATCT TGCCXTTTCTC CATCAAAGCA 1200 

ATGGCTGAAG OGAATGTCTC TCTAAGGAGA ATGAAGAAAA TTCTCATAGA TAAAAGCCOC 1260 

CCATCTTACA TCACCCAACC AGAAGACCCA GATACTGTCT TGCTTTTAGC AAATGCCACC 1320 

TTGACATGGG AGCATGAAGC CAGCAGGAAA AQTACCCCAA AGAAATT6CA GAACCAGAAA 1380 

AGGCATTTAT GCAAGAAACA GA66TCAGAG 0CATACA6TG AGAOGAGTCC ACCA60CAAG 1440 

GGAGCCACT6 6CCCAGAG6A OCAAAGTGAC AGCCTCAAAT 0GGTTCT6CA GAGCATAAGC 1500 

TTTCTGGTGA OAAAGTTATG TOGTTATCCC GAAGCOCAGC TCCTG GCTTO GAOGTGOCCA 1560 

0CACT6TTTG TTGG G A6AAT CATCAGA6GA TACAGGCCTC ATQGATTTTC TGCTAAAGAC 1620 

AAQGATGAAT CTAGAAGGCT TCTTACTTGQ CCCCAAGAAG TGGATAGGAC TCAAAGGGCA 1680 

GCCAAATACX: TGGG6AAGAT CTTG6GAATA T6TGGGAATG TGGGAAGT6G AAAgGCTCC 1740 

CTCCTTGCA6 CTCT0CTA68 ACAGAXX3CAG CTGCAGAAAO GGGTGGT66C AGTGAATOSA 1800 

ACTTTG6CCT ACX3TTTCACA GCAG6CAT66 ATCTTTOTG GAAATGIGAG AGAAAACATA 1860 

CTCTTTGGAQ AAAAGTATGA TCACCAAAGG TATCAGCACA CAQTCOGCGT CTGTGGCCTC 1920 

CAGAA66ACC T6AGCAACCT CCCCTATGGA GACCTGACTG AGATTGGGGA GOGGGGCCTC 1980 

AACCTCTCTG 6G6GGCA6AG 6CAGAGGATT AGOCTQGCCC 60GCTGTCTA CT006ACGGT 2040 

CA6CTCTAOC TGCTGGAOGA CCCCCTO T OQ GCOGTGGAaS CCCftCGTGGG 6AAGCA0GTC 2100 

TTTGAGGAiST GCATTAA6AA GA06CTCA6G GGAAAGAOUS TG6TCCTG6T GAOOO^CCAG 2160 

CTACAGTTCT TAGAGTCTTG TGATGAAGTT ATTTTATTAO AAQATGGA6A GATTTGTGAA 2220 

AA66GAACCC ACAA6GAGTT AATGGAGGAG AGAGGGOQCT ATGCAAAACT GATTCACAAC 2260 

CTGOGAGGAT T6CAGTTCAA GGATCCTX3AA CAGCTTTACA ATGCAGCAAT GGT06AAGCC 2340 

TTCAAGGAGA GCCCTGCTGA GAGAGAG6AA GATSCPOOTA TAATOGGGTA OCTOCTTTCT 2400 

CTCTTCACTG TGTTCCTCTT CCTCCTQATG ATTGGCAGOG CTOCCTTCAG O^ACTGGTGG 2460 

CTG GG TCTCT GGTTGGACAA GOOCTCAOGG ATGACCTGT6 GGCCCCAGGG CAACAGGACC 2520 

ATGTGTGAGG TOGGCQOGGT GCTGGCAGAC ATCGGTCAGC ATQTGTACXaV GTGGGTGTAC 2580 

ACTGCAAGCA TG6TGTTCAT GCTGGTGTTT GGGGTCACCA AAGGCnUiT CTTCACCAAG 2640 
ACCACACTGA TG6CATCCTC CTCTCTGCAT QACAOCSGTGT TTGATAACaT CTTAAAGA6C 2700

CX3UITGAGTT TCTTTGACA.C GACTCOCACT OGCAOGCTAft TGAACOGTTT TTCOUISGAT 2760 

ATGGA0C5AGC TGGATGTGAG GCTGCCX3TTT CAOGCaGAGA ACTTTCTGCA GCAGTTTTrT 2820 

ATGGTGGTGT TTATTCTCGT GATCTTGGCT GCTGTGTTTC CTGCTGTCCT TTTAGTOSTG 2880 

GCX31GCCTTG CTGTA6GCTT CTTCATTCTG TTAOGCATTT TCCACAGAGO AGTCCAGGAG 2940 

CTCMGAAGG TGGASAATGT CAOGGGOTCA OOCTGOTTCA CCCACATCAC CTCCTGCAT6 3000 

CAGGGOCTGG 6CATCATTCA 06CCTATGGC AAGAAGQAGA 6CT6CATCAC CTATACTTCA 3060 

TCCAAAGGCC TGTCATTGTC ATACATC31TC CAGCTGAGCG GACTGCTCCA AGTGTGTGTG 3120 

CQAAOGGGAA CAGAGAOGCA AGCCAAATTC ACCTCCGTGG AGCTGCTCAG GGAATACATT 3180 

TOSACCTGTG TTCCTGAATG CACTCATCCC CTCAAAGTGG G6A0CTGTCC CAAGGACTGG 3240 

CO CAGC l^TO GGGAGATCAC CTTCAGAGAC TATGAOATGA OATACAGAGA CAACAOOCOC 3300 

crrGTTCTGG ACAGCCT6AA CTTGAACATA CAAAOTGGGC A6ACAGT0GG GATT6TTGGA 3360 

AGAACAGGTT COGGAAAGTC ATOGTTAGGA ATGGCTTTGT TTOGTCTGGT GGAGCCAGCC 3420 

AGTGGCACAA TCTTTATTGA TGAGGTGGAT ATCTGCATTC TCRGCTTGGA AGACCTCAGA 3480 

AGCAAGCTGA CTGTGAXCCC ACAGGATCCT GTCCTGTTTG TAGGTACA6T AAGGTACAAC 3540 

TTGGATCCCT TTQAGAGTCA CACCGAT6AG ATGCTCTG3C A6GTTCTGGA GAGAACATTC 3600 

ATGA6AGACA CAATAATGAA ACTCCCA6AA AAATTACAGO CAGAAGTCAC A6AAAATGGA 3660 

GAAAACTTCT CAGTAGGGGA A0GTCAGCT6 CTTTGTGTGG CCOGAGCTCT TCTCOGTAAT 3720 

TCAAAGATCA TTCTCCTTGA TGAAGOCACC GCCTCTATGO ACTCCAAGAC TGACAOCXTTG 3780 

GTTCAGAACA CCATCAAAGA TGCCTTCAAG GGCTGCACTG TGCTGACCAT CXSOOCACOSC 3840 

CrCAACACAG TTCTCAACTG OGATCA06TC CTGGTTATGQ AAAATGGGAA 6GT6ATTGAG 3900 

TTTGACAAGC CTGAAGTCCT TGCAGAGAAG CCAGATTCra GATTT606AT GT1!ACTA6GA 3960 
GCAGAAGTCA GArPGTAG 

Seq ID NO: 513 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MVGBGPYLIS DLDQRGRRRS FABRYDPSLK TMIPVRPCAR LAPHPVDDAG LLSPATPSHL 60 

TPVMVK6YRQ RLTVDTliPPL STYDSSDTNA KRPRVLWDEB VARVGPEKAS LSHWWKFQR 120 

TRVLMDIVAN ILCIIMAAIG PTVLIHQILQ QTERTSGKVW VGIGLCIALP ATEFTKVFFW 180 

ALAHAimST AXRUCVAX^ LVFENLVSFK TLTHISVGEV WILSSDSYS LFEAALFCPL 240 

PATIPIU4VP CAAYAPPIL6 PTALIGISVY VIPIPVQMFM AKUISAFRRS AILVTDKRVQ 300 

TMKEFLTCIR LXKMYAWEK5 FTNTIOBIRR RERKLLEKAG FVQSGMSALA PIVSTIAIVL 360 

TLSCHILLRR KLTAPVAPSV lAHFMVMKPS lAXLPFSIKA MAEANVSLRR KKKILIDKSP 420 

PSYITQPEDP DTVLLLANAT LTWEHEASRK STPKKLQNQK RKLCKKQRSE AYSERSPPAK 460 

GATGPEBQSD SZiKSVLHSXS FWRKLCRYP EAQLLAWRWP AVFVGRIIR6 YRPHGFSAKD 540 

XDBSRRLLTW PQBVSRTQRA AKYLGKILGZ COIVGSGKSS LLAALLGQMQ LQKGWAVNG 600 

TLAYVSQQAW IFHGNVRENI LFGEKYDHQR YQHTVRVCX3L QXSLSNLPYG DLTEIGERGL 660 

NLSGGQRQRI SLARAVYSDR QLYUjDDPLS AVDAHVGKHV FEECIKKTLR GKTWLVTHQ 720 

liQFLESCDEV ILLEDGEICB KGTHKELMEE RGRYAKLIKN LRGLQFKDPB HLYNAAMVEA 780 

FKESPAERBB DAOIIGYLLS LPTVFLFLLM IGSAAPSNHH LOLIWKGSR HTOSPQGMRT 640 

KCEVGAVLAD IGQHVYQWVY TASMVFMLVP GVTKGFVPTK TTLMASSSM DTVFDKILKS 900 

PMSFFDTTPT GRLMNRFSKD MDELDVRLPP HAENFLQQFP MWFILVILA AVFPAVLLW 960 

ASIAVGFPIL LRIFHRGVQS LKKVENVSRS PWPTHITSSM QGLGIIHAYG KKBSCITYTS 1020 

SXGLSLSYII QLSGUiQVCV RTGTETQAKF TSVELLRBYl xSTCVPECTHP LKVGTCPKDW 1080 

PS06BXTFBD YOMRYRONTP LVLDSXJR^I QSGQTVGIVO RTGS6KSSLQ NALFRLVEPA 1140 

8GTXFIDEVD XCILSLEDLR TKLTVXPQDP VLFVGTVRYN LDPFBSBTDB NLWQVIiERTF 1200 

MRDTXMKX.FB KLQAEVTENG ENFSVGERQL LCVASALLRN SKXXXiLDEAT ASMDSKXDTL 1260 

VONTXKDAFK 6CTVLTIAHR UnVUNOXEV LVKENGKVIB FDICPBVLABK PDSAFAMLIA 1320 
AEVRL 

Seq ID NO: 514 DNA sequence 
Nucleic Acid Accession #: Z31560
Coding sequence: 1-966

1 11 21 31 41 51

| | | | | |

C3VCA6C5C3CCX: GCATCTACAA CATGATGGAG ACGGAGCTGA A0CCGCCX3G0 CCX3SCAGCAA 60 

ACTTCGGGGG GCGGOSGCGG CAACTCCACC GOGGCGOCGG CCOGCGGCAA CCA6AAAAAC 120 

AGCCC3GGACC GOGTCAAGCG GCCCATGAAT 0CCTTCAT66 TGTG6TCCCG C6GGCAG06G 180 

GGGAAQATG^ COCAGQAGAA CCCCAAGAT6 CACAACT0G6 AGATCAGCAA GOGCCTGGGC 240 

GC06AGTG6A AACTTTTGTC GGAGAOGGAG AAG06G00GT TCATCGA06A GGCTAAGCGG 300 

CTSOGAQCGC TGCACATGAA GGAGCACCOG GATTATAAAT ACOOGCCCOO G0G6AAAACC 360 

AAGACGCTCA TCAAGAAGGA TAAGTACACG CTGCCCCGCO OGCTGCTGGC CCCCGGCGGC 420 

AATAGCATGG CGAG0QGG6T CGGGGTG6QC GC06GCCTG6 GOGOGGGCGT GAACCAGOGC 480 

ATG6ACAGTT AGGC6CACAT 6AA0G6CT6S AGCAA0G6CA GCTACAGCAT GATGCAGGAC 540 

CAOCTGGOCT ACC08CAGCA C0CX3GGCCTC AAT60GCA0G GGGCAGGGCA GATGCA6GCC 600 

ATGCACCGCT ACGA06TGAG CGCCCTGCAG TACAACTCCA TGACCAGCTC GCAQACCTAC 660 

ATGAAOGGCT CGCCCACCTA CAGCATGTCC TACTC3GCA0C AGOGCACCCC T6GCATGGCT 720 

CTTG6CTCCA TGGGTTCGGT GGTCAAGTCC GA0QCCA6CT CCAOCCCCCC TGT6GTTACC 780 

TCTTCCTCCC ACTCCAOQGC QCCCIGCXM3 6CCGGGGACC TCCG6GACAT GATCAGCATG 840 

TATCTCCCOO G0800QAGGT GOOGGAACCC GC06CCCCCA 6CAGACTTCA CAT6TGCCAG 900 

CACTACCAGA OOGGCCCGGT GCCCGGCAOG GCCATTAAOG GCACACTGCC CCTCTCACAC 960 

ATGTGAGG6C OQGACAGOGA ACTGGAGGGG GGA6AAATTT TGAAAGAAAA ACGAGGGAAA 1020 

TGGGAGGGGT 6CAAAAGAGG AGAGTAAGAA ACAGCATGGA GAAAACCCGG TAOSCTCAAA 1080 



Seq ID NO: 515 Protein sequence
Protein Accession #: CAA83435

1 11 21 31 41 51

| | | | | |

aSARMYHMMB TELKPPGPQQ TSOGGGGNST AAAA6GNQKN SPDRVKRPKN AFMVH5RGQR 60 

REMAQEI7PKM HNSEISKRLG AEWKLLSHTE KRPFXDBAKR LRALHMKBHP DYKYRPRRKT 120 

KTIJ4KRDICYT UPG6LLAPG6 NSMA8GVGVG A6LGA6VNQR MDSYAHKI36W SNGSYSKMQD 180 
QLGypQHFGIi KABGAHQMQP MHRYCVSALQ yNSMTSSOTY KNGSFTYSMS YSQQGTFGMA 240

LGSMGSWKS EASSSFEWT SSSHSHAPCQ AGDUtOMISM YLF6ABVPEP AAFSSLBMSQ 300 
HYQSGPVPGT AINGTLPLSH M 

Seq ID NO: 516 DNA sequence 
Nucleic Acid Accession #: U91618
Coding sequence: 29..541

1 11 21 31 41 51

| | | | | |

06QACTT66C TTGTTAGAAG 6CR3AAAGAT GATGGCAGGA ATGAAAATCX! A6CTTGTAT6 60 

CAT6CTACTC CTGGCTTTCA GCTCCTGGAG TCTGTG C TCA GATTCAGAAG AG(»AATGAA 120 

AGCATTAGAA GCAGATTTCT TGACCAATAT GCATACATCA AAGATTAGTA AAGCACATGT 180 

TCCCTCTTGG AAGATGACTC TGCTAAATGT TTGC3«5TCTT GTAAATAATT TGAACAGCCC 240 

AGCTGAGGAA ACAGGAGAAQ TTCATGAAGA GGAGCT7GTT GCAAGAAGGA AACTTCCTAC 300 

TGCTTTAGAT GGCTTTAGCT TGGAAGCAAT GTTGACAATA TACCAGCTCC ACAAAATCTG 360 

TCACAGCAjGG GCTTTTCAAC ACTGGGAGTT AATCCAGGAA GATATTCTTG ATACTGGAAA 420 

TGACAAAAAT GGAAAGGAAG AAGTCATAAA GAGAAAAATT CCTTATATTC TGAAACGGGA 480 

GCTGTATGAG AATAAACCCA GAAGACCCTA CATACTCAAA AGAGATTCTF ACTATTACTG 540 

AGAGAATAAA TCATTTATTT ACATGTGATT GTOATTCATC ATCCCTTAAT TAAATATCAA 600 

ATTATATTTQ TCTGAAAATXS T6ACAAACAC ACTTATCTGT CTCTTCTACA ATTGTGGTTT 660 

ATTGAATGTG TTITrCTGCA CTAATAGAAA TTAGACTAAG TGTTTTCAAA TAAATCTAAA 720 
TCTTCAAAAA AAAAAAAAAA AAATGGGGCC GCAATT 

Seq ID NO: 517 Protein sequence
Protein Accession #: AAB50564

1 11 21 31 41 51

| | | | | |

HMAGMKIQLV CKLLLAFSSH SLCSDSEEEM KALEADFLTN MHTSKISKAB VPSifKMTLIiN 60 

VCSLVNNLNS PAEETGEVHE EELVARRKLP TALDGFSLEA MLTIYQLHKI GBSRAFQHWB 120 
LIQEDILDTG NDKNGKBEVI KRKIPYILKR QLYOnCPRRP YZLRRDSYVY 

Seq ID NO: 518 DNA sequence

Nucleic Acid Accession #: NM_006536.2

Coding sequence: 109..2940

1 11 21 31 41 51

| | | | | |

ACCTAAAACC TTGCAAGTTC AGGAAGAAAC CATCTCCATC CATATTGAAA ACCTGACACA 60 

ATGTATGCAG CAGGCTCAGT GTGAGTGAAC TGGAGGCTTC TCTACAACAT QACCCAAAOO 120 

AGCATTGCAG GTCCTATTTG CAACCTGAAG TTTGTGACTC TCCTGGTT6C CTTAA6TTCA 180 

GAACTCCCAT TCCTGG6A0C TGGAGTACAO CTTCAAOACA ATG6GTATAA TGGATT6CTC 240 

ATTGCAATTA ATCCTCAOGT AGCTGAGAAT CAGAAOCTCA TCTCAAACAT TAAGGAAATG 300 

ATAACTGAAG CTTCATTTTA CCTATTTAAT GCTACCAAGA GAAGAGTATT TTTCAGAAAT 360 

ATAAAGATTT TAATACCTGC CACATGGAAA GCTAATAATA ACAGCAAAAT AAAACAAGAA 420 

TCATATGAAA AGGCAAATGT CATAGTGACT GACTGGTATG GGGCACATGG AGATGATCCA 480 

TACACCCTAC AATACAGAGQ GTGTGGAAAA GAGGOAAAAT ACATTCATTT CACACCTAAT 540 

TTCCTACTGA ATGATAACTT AACAGCTG6C TA0G6ATCAC GA06C0GA6T 6TTT6TCCAT 600 

GAATGGGCCC ACCTCC6TTG GGGTGTGTTC GATGAGTATA ACAATGACAA ACCTTTCTAC 660 

ATAAATGGGC AAAATCAAAT TAAAGTGACA AGGTGTTCAT CTGACATCAC AGGCATTTTT 720 

GTGT6TGAAA AAOGTCCTTO CCCCCAAGAA AACTQTATTA TTAGTAAGCT rTTTAAAGAA 780 

GGAT6CACCT TTATCTACAA TAGCACCCAA AATQCAACT6 CATCAATAAT GTTCAT6CAA 840 

ASTTTATCTT CTGTGGTTGA A T TTTGTAAT 6CAAGTACCC ACAACCAA6A AGCACCAAAC 900 

CTACAGAACC AGATGTGCAG CCTCAGAAGT GCATGGGATG TAATCACAGA CTCTGCTGAC 960 

TTTCACCACA GCTTTCCCAT GAATGGGACT GAGCTTCCAC CTCCTCCCAC ATTCTCGCTT 1020 

GTACAGGCTG GTGACAAAGT GGTCTGTTTA GTGCTGGATG TGTCCAGCAA GATGGCA6AG 1080 

GCTGACAGAC TCCTTCAACT ACAACAAOCC GCAOAATTTT ATTTGATGCA GATTGTTGAA 1140 

ATTCATACCT T06TGG6CAT 7GCCAGTTTC GACAGCAAAG 6AGAGATCAG AGCCCAGCTA 1200 

CACCAAATTA ACAOCAATGA TGATOGAAAG ■ n X i C'l W m' CATATCTGCC CACCACTGTA 1260 

TCAGCTAAAA CAGACATCAG CATTTGTTCA GGGCTTAAGA AAGGATTTGA GGTGGTTGAA 1320 

AAACTGAATG GAAAAGCTTA TGGCTCTGTG ATGATATTAG TGACCAGGQG AGAT6ATAAG 1380 

CTTCTTG6CA ATTGCTTAOC CACTGTGCTC AGGACTGOTT CAACAATTCA CTCCATT8CC 1440 

CTGGGTTCAT CT6CAGCCCC AAATCTGGA6 GAATTATCAC GTCTTACAGQ AG6TTTAAAG 1500 

TTCTTTGTTC CAGATATATC AAACTCCAAT AGCATGATTG ATGCTTTCAO TAGAATTTCC 1560 

TCTQGAACTO GAGACATTTT CCAGCAACAT ATTCAGCTTG AAAGTACAGG TGAAAATGTC 1620 

AAACCTCACC ATCAATTGAA AAACACAGTG ACTGTGGATA ATACIGTGGG CAAGGACACT 1680 

ATGTTTCTAG TTAOGTGGCA GGOCAGTGGT CCTCCTGAGA TTAXATTATT TQATCCTOAT 1740 

GGACGAAAAT ACTACACAAA TAATTTTATC AGCAATCTAA CTTTT08GAC AGCIAGTCTT 1600 

TGGATTCCAG GAACAGCTAA GCCTGGGCAC TQGACTTACA CCCTSAACAA TACCCATCAT 1860 

TCTCTGCAAG OCCTGAAAGT GACAGTGACC TCTCGOGCCT CCAACTCAGC TGTGCCCCCA 1920 

GCCACTGTGG AAGCCTTTGT GGAAAGAGAC AGOCTCCATT TTCCTCATCC TGTGATQATT 1980 

TAT6CCAATG TCAAACAGG6 ATTTTATCCC ATTCTTAATG CCACTGTCAC TGCCACAGTT 2040 

6AGCCAGAGA CTOGAGATCC TGTTACGCTG AGACTCCTT6 ATGATG6AGC AGGT6CTGAT 2100 

GTTATAAAAA ATX5ATGGAAT TTACTCGAGG TATTTTTTCT CCTTTGCTGC AAATG6TA6A 2160 

TATA6CTTGA AAGTGCATGT CAATCACTCT CCCAGCATAA GCACCCCAGC CCACTCTATT 2220 

CCAGGGAGTC ATGCTATGTA TGTACCAG6T TACACAGCAA AOGGTAATAT TCAGATGAAT 2280 

GCTCCAAGGA AATCAGTAGG CAGAAATGAG GAGGACOGAA AGTGGGGCTT TA6CCGA6TC 2340 

AGCTCAGGAG GCTCCTTTTC AGTGCTGGGA GTTCCAGCTG GOCCCCACCC TGATGTGTTT 2400 

CCACCATGCA AAATTATTGA CCTGGAAGCT GTAAAAGTAG AAGAGGAATT GACCCTATCT 2460 

TGGACAGCAC CTQGAGAAGA CTTTGATCAO GGCCAGOCTA CAAGCTATCA AATAAGAA7X3 2520 

AGTAAAAGTC TACAGAATAT CCAAGATGAC TTTAACAATG CTATTTTAGT AAATACATCA 2580 

AAGCGAAATC CTCAOCAAOC TGGCATCAG6 GAGATATTTA OGTTCTCACC CCAGATTTCC 2640 

A06AAIGGAC CTGAACATCA GCCAAATGGA GAAACACATG AAAGCCACAG AATTTATG7T 2700 

6CAATACGAG CAAT6GATAG 6AACTCCTTA CAGTCTGCTG TATCTAACAT TGCCCAGGOS 2760 

CCTCTGTTtA TTGOCCCCAA TTCTGATCCT GTACCTGCCA GAGATTATCT TATATTGAAA 2820 

Q6ASTTTTAA CA6CAATGG0 TTTGATAGGA ATCATTTGCC TTATTATAGI TGTGACACAT 2880 
omkcrmA gcaggaaaaa gagagcagac aagouuusaga atqqmcmh attattatka 2940

ATAAATATCC AAAGT6TCTT CCTTCTTA6A TATAAGACCC ATG6CCTT06 ACTACAAAAA 3000 

CATACTAACa AAGTCAAATT AACATCAAAA CTGTATTAAA ATGCATTSAG TTTTTGTACa 3060 

ATACAGATAA GATTTTTACA TGGTAGATCA ACAATTCTTT TTGGGG6TAG ATTAGAAAAC 3120 

CCTTACACTT TGGCTATGAA CAAATAATAA AAATTATTCT TTAAAGTAAT GTCTTTAAAG 3180 

GCAAASGGAA GG6TAAAGTC GGACCAGTGT CAAGGAAAGT TTGTTTTATT GA66TGGAAA 3240 

AATAGCOCXA AGCAGAGAAA AGGAGGGTAG GTCT6CATTA TAACTGTCTG TGT6AAGCAA 3300 

TGATTTRGTT ACTTTGATTA ATTTTTCTTT TCTCCTTATC TGTGCAGTAC AGGTTGCTTG 3360 

TTTAC31TGAA GATCATGCTA TATTTTATAT ATGTAGCCCC TAATGCAAAG CTCTTTACXrr 3420 

CTTGCTATTT TGTTATATAT ATTTCAGATG ACATCTCCCT GCTAATGCTC AGAGATCTTT 3480 

TTTCACTGTA A6AGGTAACC TTTAACAATA TGG6TATTAC CTTlGrCTCT TCATAOOGOT 3540 

TTTATGACAA AGGTCTATTG AATTTATTT3 INTGTAAGTT TCTACTCCCA TCAAAGCAOC 3600 

TTTCTAAGTT TATTGCCTTG GGTTATTATQ 6AATGAXAGT TATAGOOCQI TATAATGOCT 3660 
TACCTAGGAA A 

Seq ID NO: 519 Protein sequence 
Protein Accession #: NP_006527.1

1 11 21 31 41 51

| | | | | |

MTQRSIAGPI CMLKFVTLLV ALSSELPFLG AGVQLQDMGY NGZiLlAINPQ VPEN<^ISN 60 

IKEMITBASF YLFNATKRRV FFRKIKILIP ATWICANNNSK IRQBSYEXAN VIVTBWYGAH 120 

GDDPYTLQYR GCGKEGKYIH FTPMPLLNDN LTAGYGSRGR VFVHEWAHLR WGVFDBYNND 180 

KPPyiNGQNQ IKVTRCSSDI TGIFVCEKGP CPQENCIISK LPKEGCTFIY NSTQNATASI 240 

MFMQSLSSW EFQIASTHNQ EAPNLQNQMC SLRSAWDVIT DSADPHHSFP MNGTELPPPP 300 

TPSLVQAGDK WCLVIJ3VSS KKAEADRI1I.Q LQQAAEPYLM QIVBIHTFVG lASFDSKGEI 360 

RAQLHQINSN DDRKI*LVSYL PTTVSAKTDI SICSGLKKGP BWEKUJGKA YGSVMILVTS 420 

GDDKLLOfCL PTVX^SGSTl HSXALGSSAA PNLKELSRLT GGLKFFVPDI SNSKSMIDAF 480 

SRISSGTGDI PQQHIQLEST GBNVKPHHQL KNTVTVDNTV GNDTMFLVW QASGPPEIIL 540 

PDPDGRKyyr NNPITNLTPR TASLWIPGTA KPOHWTYTLII NTHHSLQAIiK VTVTSRASNS 600 

AVPPATVEAF VERDSLHPPH PVMIYANVKQ GPypILNATV TATVEPETGD PVTLRLLDDG 660 

A6ADVIKNDG lYSRYFFSFA AKGRYSUCVH VNRSPSISTP AHStPGSHAM YVPGYTANOf 720 

ZQMHAPRKSV QBJSSSBRSSHQ FSRVSSG6SF 6VLGVPAGPB PDVFFFCKIX DLEAVKVSEE 780 

LTLSHTAFGE DFDQGQATSy BZaMSKSLQN IQDDFNMAIL VNTSKRNPQQ AGZRBIFTFS 640 

PQISTK6PEH QP8SBTHBSH RIYVAZRAMD BKSIiQSAVSN lAQAPLFIPP HSDFVPABDY 900 
LZLKGVLTAM GLX6ZZCLIZ WTBBTIiSRR XRADKKENGT KLL 

Seq ID NO: 520 DNA sequence

Nucleic Acid Accession #: NM_000226.1

Coding sequence: 82..3600

1 11 21 31 41 51

| | | | | |

GCTTTCAG6C GATCTG6AGA AAGAA06GCA GAACACACAG CAAGGAAAGG TCCTTTCT60 60 

GGATCACCCC ATTGGCTGAA GATGAGACCA TTCTTCCTCT TGTGTTTTGC CCTGCCTGGC 120 

CTCCTGCATG CCCAACAAGC CTGCTCCCGT GGGGCCTGCT ATCCACCTGT TGGGGACCTG 180 

CTTGTTGGGA GGACCCX5GTT TCTCCGAGCT TCATCTACCT GTGGACTGAC CAAGCXTTGAG 240 

ACCTACT6CA COCAGTATGO CXSAGTOOCAG ATQAAATGCT GCAAGT6TGA CTCCAGGCAG 300 

CCTCACAACT ACTACAGTCA CCX3A6TAGAG AATGTGGCTT CATCCTC O GG CCCCATG06C 360 

TGGTGGCAGT CCCAGAATGA TGTGAACCCT GTCTCTCTGC AGCTGGACCT 6GACAGGAGA 420 

TTCCAGCTTC AAGAAGTCAT GATGGAGTTC CAGGGGCCCA TGCCCGCOGQ CATGCTGATT 480 

GA606CTCCT CAGACTTC6G TAAGACCTGG 0GA6TGTACC AGTACCTGGC TGCCGACTGC 540 

ACCTCCACCT T O CCTO G GGT COGCCAGGGT OGGGCTCAGA GCTGGCAGGA TGTTOGGTGC 600 

CAGTOCCT6C CTCAGAGGOC TAATGCA06C CTAAATGGG6 GGAAGGTCCA ACTTAACCTT 660 

ATGQATTTAG TGTCTGGGAT TCCAGCAACT CAAAGTCAAA AAATTCAAGA GGTGGGGGAG 720 

ATCACAAACT TGAGAGTCAA TTTCACCAGG CTGGCCCCTG TGCCOCAAAO GGGCTACCAC 780 

CC7CCCAGCG CCTACTAT6C T6TGTCCCAG CTCOGTCTGC AGGG6AGCTG CTTCTOTCAC 840 

G6GCAT6CTG ATG6CTG06C ACCCAAGCCT GG66CCTCTG CAQGCCOCTC CACG6CTGTG 900 

CAGGTCCAC6 ATGTCTGTGT CTOCX»GCAC AACACTOCCG 6C0CAAATTG TGA60GCTGT 960 

GCACCCTTCT ACAACAACCG GCCCTGGAGA CCGGC6GAGG GCCAG6ACGC CCATGAATGC 1020 

CAAAGGTGCG ACTGCAATGG GCACTCAGAO ACATGTCACT TTGACCCCGC TGTGTTTGCC 1080 

GCCAGCCAGG GGQCATATGG A0GTGTGT6T GACAATTGCC GG6A0CACAC CGAAGGCAAG 1140 

AACTGIGAGC GGTGTCAGCT GCACTATTTC 0GGAAC08GC GCOOGGGAGC TTCC31TTCAG 1200 

GAGAGCTGCA TCTCCTGC6A GTGTGATCC6 GATGGGGCAG TG0CAGGG6C TCCCTGTGAC 1260 

CCAGTGACCG GGCAGT6TGT GTGCAA6GAC CATGTGCAGG 6A6AG0GCTG T6ACCTATGC 1320 

AAOCCGGGCT TCACTGGACT CACCTACGCC AACCCGCAGG GCXXXX^CCG CTGTGACTQC 1380 

AACATCCTGG 6GTCCCG6AG GGACAT6CC6 TGTGAC6A6G AGAGTG6G0G CTGCCTfTGT 1440 

CTOCCCAACG TG 6 TGG0T0C CAAAT6TGAC CAGTGT G CTC CCZACCACTG GAAGCTGGCC 1500 

AGTGGCCAGG GCTGTGAACC GTGT6CCTGC GACC06CACA ACTCCCCTCA GCCCACAGT6 1560 

CAACCAGTTC ACAGGGCAGT GCCCTGTOSa GAAGGCTTTG GTOGOCTGAT GTGCAGOGCT 1620 

GCAGCCATCC QCCAGTGTCC AQACCGGACC TATGGAGACG TGGCCACAGG ATGCCXSAGCC 1680 

TGTGACTGTG ATTTCOGGGG AACAGAGGGC C0GGGCT6C6 ACAAGGCATC AGGG06CTGC 1740 

CTCT6C06CC CT6GCTTGAC 06GGC0C08C TGTQACCAGT GOCAGCQAGG CTACT6CAAT 1800 

G6CTACXXX3G TGTGCGTG G C CTGCCACCCT TGCTTCCA6A CCTAT6ATGC GGACCTC06G 1860 

GAGCAGQCCC TGCGCTTTQG TAGACTOOQC AATGCCACCG CCAGCCTGTG GTC3W3GGCCr 1920 

GGGCTGGAOG ACCX3TGGCCT GGOCTCCCGG ATCCTAGATG CAAAGAGTAA GATTGAGCAG 1980 

ATCCGAGCAG TTCTCACCAG CCCC6CAGTC ACAGA6CAGG AGGTG6CTCA GGTQG CCAGT 2040 

6CCATCCTCT OCCTCAGGOG AACTCTOCAG OGCCTGCAGC TGGATCTOCC CCTQGAGGAG 2100 

GAGAGGTTGT COCTTCOGAG A6AGCTGGA6 AGTCTTGACA 6AA6CTTCAA IGGTCTCCTT 2160 

ACTATGTATC AGAGGAAGAG GGAGCAGTTT GAAAAAATAA GCAGTGCTGA TCCTTCAGGA 2220 

GCCTTCCGQA TGCTGAGCAC AGCCTAOGAG CAGTCAGCCC AGGCTGCTCA GCAGGTCTCC 2280 

6ACA6CTCGC GOCTTTTGGA CCA6CTCAGG GACAGC066A GAGAG6CAGA GAOQCTGGTO 2340 

GGGCAGGOGG GAGGAOQAOG AQGCACOQGC A6CCCCAA6C TTGT66CCCT QA0QCTG6A0 2400 

ATGTCTTGGT TGCCTGACCT GACACOCAOC TTCAACAA6C TCTGTGGCAA CTCCAGGCAG 2460 

ATGGCTTGCA OCCCAATATC ATGCCCTGGT GAGCTATGTC CCCAAGACAA TGGCACAGCC 2520 

TGTGGCTCCC GCTGCAGGQQ TGTCCTTCCC AGCSKXaSGTG GGQCCTTCTT GATGGCGGGG 2580 

CAGGTGGCTO AGCAGCTGCG G6GCTTCAAT GCOCAGCTCC A60GGACCAG GCAGAIGATT 2640 
AG66CAGC03 AGGAATCT6C CTOVCAGATT CAATCCAGTQ CGCA606CTT QGAGACCCAG 2700

GTGAGOGCCA QCCGCtCCCA GA.TGGAGGAA GftTOTCAGAC 6CACA066CT OCTAATCCAG 2760 

CaCGTCOGGG ACTTCCTAAC AGAOCOOGAC ACTGATGCAG CCACTATCX31 GGAGGTCAGC 2820 

GAGGCa?rGC TGGCCCTGTG GCTGCCCACA GACTCAGCTA CTX3TTCTGCA GAAGATGAAT 2880 

GAGATCCAOS CCATT6CAGC CAGGCTCCCC AACGTGGACT TGGTGCTGTC CCAGACCAAG 2940 

CAG6ACATT0 CG08T0CC00 CGOGTTGCAO GCTGAGGCTQ AG6AA6CGA0 GAGCCGA6CC 3000 

' CATGCAGTGQ A6GGCCA0GT GGAAGATCrTG OTTGQGAACC TQCGGCAGGQ GACA6T66CA 3060 

CTGCAGGAAG CTCAGGACAC CATGCAAGGC ACCA0CC3GCT CCCTTCGGCT TATCCAGGAC 3120 

AGGGTTGCTG AGGTTCAGCA GGTACTGCGG CCAGCAGAAA AGCTGGTGAC AAGCATGACC 3180 

AAGCA6CT06 GTGACTTCTG GACAGGGATG GAGGA6CTCC GCCACCAAGC CGGGCAOCAO 3240 

GGGGCAGAGG CAGTOCAGGC CCAGCAGCTT OOSGAAGGrG CGAGOGAOOl GQCATTGAGT 3300 

GCCCAAGAGO GATTTQAOAG AATAAAACAA AAGTHTQCTa AGTTOAAGGA OOGOTTG OG T 3360 

CAGAGTTCCA TGCTGGGTGA GCAGGGTGCC CGGATCCAGA GTGTGAAGAC AGAGGCAGAG 3420 

GAGCTGTTTG GGGAGACXIAT GGAGATGATG GACAGGATGA AAGACATG6A GTTGGAGCTG 3480 

CTGCGGG6CA 6CCAGGCCAT CATGCTGCGC TCGGOGGAOC TGACAGGACT GGAGAAGOGT 3540 

GTGGAGCAQA TCOGTGAOCA CATCAATGGG OGCGTQCTCT ACTAT6CCAC CTGC2UISTGA 3600 

TGCTACA6CT TCCAOOOOST TGGOOCACTC ATCTGOOQCC TTT6CTTTTQ GTTG6G66CA 3660 

GATTQQGTTa GAATGCTTTC CATCTCCAGG AGACTTTCAT GCAGCCTAAA GTACAGCCTG 3720 

GACCACCCCT GGTGTGTA6C TAGTAAGATT ACCCTGAGCT GCAGCTGAGC CTGAGCCAAT 3780 

GGGACAGTTA CACTTGACAG ACAAAGATGG TGGAGATTG6 CATGCCATTG AAACTAA6AG 3840 

CTCTCAAGTC AAG6AAGCTO GGCTGGGCAG TATOOCQOGC CTTTAGTTCT CCACTGQGGA 3900 

GGAATCCT06 ACXAAGCACA AAAACTTAAC AAAAGTGAT8 TAAAAAT6AA AAGOCAAATA 3960 
AAAATCTTTG G 

Seq ID NO: 521 Protein sequence 
Protein Accession #: NP_000219.1

1 11 21 31 41 51

| | | | | |

MHPFFLIiCFA LPOIiIiHAQQA CSRGAOTPV GDXiLVGRTRF LRASSTCGLT KPETYCTQYG 60 

EKQMKCCKCD SRQPBHYYSH RVENVASSSG PMRWWQSQKt) VNPVSLQLDL ORHFQIiQEVM 120 

MEFQGPMPAG MLIERSSDP6 RTWRVyQYLA ADCTSTFFRV RQGRPQ5WQD VROQSLPQRP 180 

NARLNGCTVQ UILMDLVSGI PATQSQKIQB VGBITNLRVH PTRLAPVPQR GYHPPSAYYA 240 

VSQLRLQGSC PCHGHADRCA PKPGASAGPS TAVQVHDVCV OQHNTAGPMC ERCAPPYNNR 300 

PKRPAEGQDA HECQRC3X3fG HSBTCBPDPA VFAASQGAYG GVCDKCROHT BGKHCERCQL 360 

HYFRJJRRPGA SIQETCISCE CDPDGAVPGA PCDPVTGQCV CKEHVQGERC DLCKPGPTGL 420 

TYANPQGCBR CDOflLGSRR UKPCDBBSGR CZ/CLFUWGP KCDQCAPYHH KZJ^SGQGCEP 480 

CACDPHNSPQ PTVQPVHRAV PCRBGFGGLM CSAAAIRQCP DRTVCTVATG CRACDCDFRG 540 

TBGPGCDKAS GRCIiC3lPGLT GPRC2)QCQRQ YOJRYPVCVA OIPCFQTYDA DI4RBQALRFG 600 

RLRNATASLW 8GPGLBDRGL A5RILDAKSX lEQIRAVLSS PAVTEQEVAQ VASAILSLRR 660 

TLQGLQLDLP LEEBTLSLPR DLESLDRSFK GLLTMYQRKR BQFEKISSAD PSGAFRKLST 720 

AYEQSAQAAQ QVSDSSRLLD QLRDSRREAE RLVRQAGGG6 GTCSPKLVAL RLEMSSLPDL 780 

TPTFNKL06N SRQMACTPIS CPGBLCPQDH GTACGSRCRG VLPRAGGAPL MAGQVAEQLR 840 

GFNAQLQRTR QMIRAAEESA SQIQSSAQRL ETQVSASRSQ MEEaJVRRTRL LIQQVRDPLT 900 

DPDTDAATIQ EVSEAVLALW LPTDSATVLQ KKNEIQAIAA RI*PNVDLVLS QTKQDIARAR 960 

RLQAEAEBAR SRAHAVGGQV EDWGNLRQG TVALQEAQDT h^GTSRSLRL IQDRVAEVQQ 1020 

VLRPAEKLVT SMTKQLGDPH TRMEELRHQA RQQGAEAVQA QQLABGASBQ ALSAQBGPER 1080 

ZKQKYAELRD RZ1GQS8MZ.6B QOARZQSVKT BAEBLF6RTM EMKDRMKDMB LEUtRGSQAZ 1140 
MLRSADLTQL SXRVEQIRDH IKGRVLYYAT CK 

Seq ID NO: 522 DNA sequence

Nucleic Acid Accession #: NM_001944.1

Coding sequence: 84..3083

1 11 21 31 41 51

| | | | | |

TTTTCTTAOA CATTAACTGC A6AC0GCTGG CA6GATAGAA GCAG066CTC ACTTGGACTT 60 

TTTCAOCAGO GAAATCAGAO ACAATGAT66 OGCTCrTOCC CAGAACTACA GGGGCTCT66 * 120 

CCATCTTOGT GGTGGTCATA TTGGTTCATG GAGAATT6G6 AATAGAGACT AAA6GTCAAT 180 

AT6ATGAAGA AGA6ATGACT AT6CAACAAG CTAAAAGAAG GCAAAAACGT GAATGGGTGA 240 

AATTTGCCAA ACCCTOCAGA GAAGGAGAAO ATAACTCAAA AAGAAACCCA ArTGCXSUlGA 300 

TTACTTCABA TTACCAAGCA AGCCAGAAAA TCACCTACOG AATCTCTG6A 6T66GAATCX3 360 

ATCAGC06GC TTTT6GAATC mUTlVnta ACAAAAACAC TGGAGATATT AACATAACA6 420 

CTATAGT06A COGGGAGGAA ACTCCAAGCT TCCTGATCAC ATGTGOGGCT CTAAATGCCC 480 

AAGQACTAGA TGTAQAGAAA CCACTTATAC TAAC3GGTTAA AATTTTGQAT ATTAATGATA 540 

ATCCTCCAGT ATTTTCACAA CAAATTTTCA TGGGTGAAAT TGAA6AAAAT AGTGCCTCAA 600 

ACTCACTGOT QATGATACTA AATGCCACAG ATGCAGAtEGA ACCAAACCAC TTGAATTCTA 660 

AAATTGCCTT CAAAATTGTC TCTCAGGAAC CAGCA66CAC ACOCATGTTC CTCCTAAGCA 720 

6AAACACTGG GGAAGTC06T ACTTTGACCA ATTCTCTTGA CC6AGAGCAA GCTAGCA6CT 780 

ATCGTCTGGT TGTGAGTGGT GCAGACAAAG ATGGA6AAGG ACTATCAACT CAAT6TGAAT 840 

GTAATATTAA AGTGAAAGAT GTCAAOaATA ACTTCCCAAT GTTTAGAGAC TCTCAGTATT 900 

CA8CA0GTAT TX3AAGAAAAT ATTTTAAGTT CTGAATTACT TCGATTTCAA GTAACAGATT 960 

TGGATGAAGA GTACACAGAT AATTGGCTTG CAGTATATTT ClTl' A OCl'Cr GGGAAT6AA6 1020 

6AAATTQGTT TOAAATACAA ACTGATCCTA 6AACTAATGA A66CATCCTG AAAGTGGTGA 1080 

AG6CTCTAGA TTAT6AACAA CTACAAAGOG TGAAACTTAO TATTGCT G TC AAAAACAAAG 1140 

CTGAATTTCA CCAATCAGTT ATCTCTOGAT AOOGAGTTCA GTCAACCCCA GTCACAATTC 1200 

AGGTAATAAA TGTAAGAOAA GGAATTGCAT TG08T0CTGC TTCCAAGACA TTTACTGTGC 1260 

AAAAA06CAT AAGTAGCAAA AAATTGGTQO ATIA1ATCCT GG6AACATAT CAAGOCATOO 1320 

ATGAG6ACAC TAACAAAGCT 6CCTCAAAT6 TCAAATA1GT CATG08A06T AAOGATGGTQ 1380 

GATACCTAAT GATTGATTCA AAAACTGCTG AAATCAAATT T6TCAAAAAT ATGAAOOGAG 1440 

ATTCTACTTT C31TAGTTAAC AAAACAATCA CAGCTGAGGT TCTGGCCATA GATGAATACa 1500 

OGGGTAAAAC TTCTACAGGC A06GTATATG TTAGAGTACC OGATTTCAAT GACAATTGTC . 1560 

CAACAGCTGT CCTG6AAAAA GATSGAGTrT GCAGTTCTTC AOCTTC300XQ GTTGTCTCGG 1620 

CTAOAACACT GAATAATASA TACACTQGOC CCTATACATT T6CACTG6AA GATCAACCTG 1680 

TAAAGTTOCC TGCOGTATGG AGTATCACAA CCCTCAATGC TACCTCGGCC CTCCTCAGAO 1740 

CCCAGQAACA GATACCTCCT GGAGTATAOC ACATCTCCCT GGTACTTACA 6ACAGTCAGA 1800 

ACAATOGGTG TBAGATGCCA 0GCAGCTT6A GACTGGAAOT CTGTCAGTOT 6ACAACAGGG 1660 
6CATCTGTG0 AACTTCTTAC OCAAOCACAIl G0CCT6GGAC
CAGOGAGGCT GOGGCCTGCC GCC^TCGGOC TGCT Q CTOCT 
TGGCCCCCXrr tctgctgttq acctgtcsact gtggggcagg 

GTGGTTTTAT CCCAGTTCCT GATGGCTCAG AAGGAACAAT 
GAGCCCATOC TCAAGACAAG 6AAA.TCAGAA ATATTT6TGT 
GAGCOGATTT CATGGAAAGT TCTGAA6TTT GTACAAAXAC 
TCSGAAOGCAC TTCAGGAATG GAAATGACCA CTAAGCTTGO 
GTGCTGCAC^ CTTT6CAACA G6GACAGT6T CAGGA6CTGC 
CTGGAGTTGG CATCTGTTCC TCAGGGCAGT CTGGAACCAT 
GA6GAACCAA TAAGGACTAC GCTGATGGG6 OGATAAGCAT 
TTTCTCAQAA AGCATTTGOC TCTGO6GAO0 AAGA0GAT66 
T6TTGATCTA TGATAATGAA GG0GCAGAT6 CCACTG6TTC 
GTTGCAGTTT TATTGCTGAT GACCTGGATG ACAGCTTCTT 
TTAAAAAACT TGCAGAGATA AGCCTTGGTG TTGATGGTGA 
CXrrCTAAAGA CAGCX36TTAT GG6ATTGAAT CCTGTG6CCA 
CA06ATTTGT TAAGTQCCAa ACTTT6TCA0 GAAGTCAAGG 
CTGGGTCTOT CCAGCCAGCT GTTTCCATOC CTGACXXTCT 
TAACGGAGAC TTACTCGGCT TCTGGTTCCC TCX5TGCAACC 
CACTTCTCAC ACAAAATGTG ATAGTGACAG AAAGGCTTGAT 
CTGGCAACCT AGCTGGCCCA AC36CAOCTAC GA6G6TCACA 
ATCCTTGCTC COGTCTAATA TGAOCAOAAT 6A6CTG6AAT 
ATCTTTGGAC TAAAGTATTC AAAATA6CAT AGCAAAGCTC 
TGGCACTTAT TAGCTTCTCT CATAAACTGA TCACGATTAT 
TACCCCAAAA GCAATATGTT GTCACTCXTTA ATTCTCAAGT 
TCTTAAAGTT TTTCAAAACC CTAAAATCAT ATTOSC 

Seq ID NO: 523 Protein sequence 
Protein Accession #: NP_001935.1



MMGLTPRTTG 
GBDNSKRNPI 
PSFIiITCRAL 
ATDADEPNHL 
DKD6BGLSTQ 
WLAVYFFTSG 
SRYRVQSTPV 
SNVKYVK6R17 
VYVRVPDPiro 
ITTLNATSAL 
TTSPGTRYGR 



MTTKLGAATE 
DGAISMNFLD 
LDDSPLDSLG 
LSGSQGASAL 
VTERVICPIS 



11 
I 

ALAIFVWIL 
AKZTSDYQAT 
KAQQLOVEKP 
NSKIAFKIVS 
C8CNIKVKDV 
NE6NWFEIQT 
TZQVZNVREO 
DGGYLMIDSK 
NCPTAVLEKD 
LRAQE5QIPPG 
PHS6RLGPAA 
ZBOAHPEDKB 
SGGAAGFATG 
SYFSQKAFAC 
PKFKKLAEIS 
SASG5VQPAV 
SVPOILAGPT 



21 
I 

VHGELRIETK 
QKITYRISGV 
LXLTVKILDI 
QEPAGTPMFIj 
NDNFPMFRDS 
DPRTNS6ILK 
ZAFRPASKTF 
TAEIKFVEMM 
AVCSSSPSW 
VYHISLVLTD 
IGLLLLGLLL 
ITNICVPPVT 
TVSGAASGFG 
AEEDDGQEAN 
LGVDGEGKEV 
SIPDPLQHGtT 
QLRGSHTMLC 



31 

1 

GQYDEEEMTM 
GIDQPPFGIP 
NDNPPVFSQQ 
LSRKTGEVRT 
QYSARIEE3II 
WKALDYEQIi 
TVQKGISSKK 
KRDSTFIVNK 
VSARTL10«RY 
SQNNRCEMPR 
LLLAPLLLLT 
ANQADFMBSS 
AATGV6ICSS 
DCLLIYDKB6 
QPPSKDSGYG 
YLVTBTYSAS 
TEDPCSRLZ 



Seq ID NO: 524 DNA sequence

Nucleic Acid Accession #: XM_058069.2

Coding sequence: 1..1413



ATGAAGTTTC 
AGCTCTACAA 
TAT6GCCTT6 
AAGGAAAAAA 
ACATCTAOCC 
AGGGAAATGC 
TACACACCTG 
TGGAGTAATO 

CTAGCCC3VTG 
GAATTCTGGA 
G6CCATT0CT 
AAATATGnnEG 
CTGTATCKSAG 
CTCTGTGACC 
TTCAAA6ACA 
ATTTCTTCCT 
AGAAATCAAG 
GAGCCAAATT 
GATGCAGCTG 
TCGAGGTATQ 
AACTTCCAAG 
TATTTCTTCC 
ACACTGAAAA 



11 

I 

TTCTAATACT 
GCCTGGAAAA 
A6ATAAACAA 
TCCAAGAAAT 
TGGAGAT6AT 
CAGGGGGGCC 
ACATGAACCG 
TTACCCOCTT 
CCCGTGGAOC 
CTTTTOGACC 
CTACACATTC 
TAGGTCTT6G 
ACATCAACAC 
ACCCAAAAGA 
CCAATTTGA6 
GGTTCTTCTG 
TAT6GCCAAC 

ATCCCAAGAG 
TTTTTAACCC 
ATGAAA6GAG 
GAATOGGGCC 
AAQGATCTAA 
GCftATAGCnS 



21 

I 

GCTCCTGCAG 
AAATAATGT6 
ACTTCCAGT6 
GCAOCACTTC 
GCACGCACCT 
CGTATGGA6G 
TGAG6AT6TT 
GAAATTGAGC 
TCATGGA6AC 
TGGATCTGGC 
AGGAG6CACA 
CCATTCTAGT 
ATTTC360CTC 
GAACCAA06C 
TTTTQATGCT 
GCTGAAGGTT 
CTTGOCATCT 
TAAAGATSAC 
CATACATTCT 
AOGTTTTTAT 
ACAGATGATG 
TAAAATTGAT 
CCAATTTGAA 



31 

1 

GCCACTGCTT 
CTATTTGGTG 
ACAAAAATGA 
TTGGGTCT6A 
CGATGTGGAG 
AAACATTATA 
GACTACGCAA 
AAOATTAACA 
TTCCATGCTT 
ATTGGAGGGG 
AACTTGTTCC 
GATCCAAAGG 
TCTGCTGATO 
TTGCCAAATC 
6TCACTA006 
TCTGAGAGAC 
GGCATT6AA6 
AAATACTGGT 
TTTGGTTTTC 
AGGACCTACT 
GACCCTGGTT 
GCAGTCTTCT 
TATGACTTCC 



GAGGTATGGC 
TSGTCTCCTG 
TTCTACTGGG 
TCATCAGTGG 
GOCTCCTGTA 
GTATGCCAGA 
A6CAGCCACT 
TTCAGGATTC 
GAGAACAAGG 
GAATTTTCTG 
CCAGGAA6CA 
TCCIGTGGGC 
GGACTCACTT 
AGGCAAAGAA 
TCCCATAGAA 
AGCTTCTGCT 
GCA6CATGQT 
TTCCACTGCA 
CTGTCCCATT 
TACTATGCTC 
ACCftCACIGA 
ACTGTATTGG 
AAATTAAATG 
ACIATTCAAA 



41 

I 

QQAKRRQKRE 
WDKMTGDIN 
IFM6EIEENS 
LTN5LDREQA 
LSSELIiHFQV 
QSVKLSIAVK 
LVDYIIjGTYQ 
TITAEVLAID 
TGPYTFALED 
SLTLEVCQCD 
cd06agstgg 
BVCINTYARG 
GQSGTMRTRR 
ADATGSPVGS 
lESOGHPIBV 
GSLVQPSTAG 



41 

I 

CTGGAGCTCT 
AAAGATACTT 
AATATA6TGG 
AAGTGACCG6 
TCCCOGATGT 
TCACCTACAG 
TCCGGAAA6C 
CAG6CATGGC 
TTGATG6CAA 
ATGCACATTT 
TCACTGCTOT 
COGTAATGTT 
ACATAGGTG6 
CTGACAATTC 
T6GGAAATAA 
CAAAGACCAG 
CTGCTTATGA 
TAATTAGCAA 
CTAACTTTGT 
TCTTTGTAGA 
ATCCCAAACT 
ACTCTAAAAA 
TACTOCAAlOG 



AGGCCGCACT 
CTGCTGCTGT 
G6AGTGACAG 
GGAATTGAAG 
ACAGCCAATG 
GGGACA6G66 
GAATCIGGAG 
GGA6CASCCA 
CATTCCACTG 
GACTCCTACT 
AATGACTGCT 
T006T6GGTT 
G6ACCCAAAT 
GTTCAGCCAC 
GTCCASCAGA 
TT8TCCG0CT 
AACTATTTAO 
GGCTTTGATC 
TCCAGTGTTC 
7GTACAGASG 

cx:aaatctg6 
6ctaataatt 

TTTGGGTTCA 
TTGTAGTAAA 



51 
I 

WVXFAKPG!R£ 
ITAIVDREET 
ASNSLVMILN 
SSYRLWSGA 
TDIDEEYTDN 
NKAEFHQSVI 
AIDEDINKAA 
EYT6KTSTGT 
QPVKLPAVWS 
NRGICX3TSYP 
VTGGFIPVPD 
lAVEGTSGMB 
STQGTNKDYA 
VGCCSPIADD 
QQTGFVKCQT 
FDPLLTQMVI 



51 
I 

TCCCCTGAAC 
AGAAAAATTT 
AAACTTAATG 
6CAACTG6AC 
CCATCATTTC 
AATCAATAAT 
TTTCCAAGTA 
T6ACATTTTG 
AG6T66AATC 
CGATGAGGAC 
TCACGAGATT 
CCCCACCTAC 
CATTOUSTCC 
AGAACCA6CT 
GATCTTTTTC 
TGTTAATTTA 
AATT6AAGCC 
TTTAAGAOCA 
GAAAAAAATT 
TAACCAGTAT 
GATTACCAAG 
CAAATACTAC 
TATCACCAAA 



1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 



60 
120 
160 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
7B0 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 



Seq ID NO: 525 Protein sequence 
Protein Accession #: P39900 

1 11 21 31 41 51 

| | | | | | 

MKFIiLILLLQ ATASGAI*PUI SSTSI*EK1QIV LPGERYLEKF YGLBINKLPV TKMKYSGKIiM 60 

KSKZQEMQHP LGLKVTGQLD TSTLS4MBAP ROGVPDVHHF RBtPGGPVWR KHYITYRINN 120 

YTSOKNREDV DYAIRKAFQV WSHVTPItKFS KnSTGMADIL WKARGAB6D FKAFDGKCGZ 180 

IiAHAFGFGSG I06DAHFDED BFWTTHSGGT NLFLTAVHBZ GBSUSUSSiSS DPKAVKFPTY 240 

KYVDINTFRIr SADDIRGIQS LYCTPKEMQR LPNPDNSEPA LCDPNLSPnA VTTVGNKIPF 300 

PKDRFFWLKV SERPICTSVNL ISSLWPTLPS GIEAAYEIKA RNQVFLFKDD KYWLISNLRP 360 

EnrrPKSZRS FGFPKFVKKI DAAVFNPRFY RTYFFVISNQy HRYDERRQMM DPGTPKLITK 420 
HFQGIGPKID AVFYSKtlRYy YFFQGSMQFB YDFUiQRITR TLKSNSWFGC 

Seq ID NO: 526 DNA sequence 

Nucleic Acid Accession #: NM_024423.1 

Coding sequence: 64..2590 

1 11 21 31 41 51 

| | | | | | 

GGC3W3GTCTC GCTCTOSGCA CCCTCCOGGC GCCCGCGTTC TCCTGGCCCT GCCCGGCATC 60 

CGGATGGCOG C06CTGG6CC C066G6CTCC GTGOSOGGAG CCGTCTGCCT GCATCTGCT G 120 

CTGACCCTG6 T6ATCTTCA6 TCGTGATGGT GAA60CT6CA AAAAGGTGAT ACTTAATGTA 180 

CCTTCTAAAC TAGAG6CAGA CAAAATAATT G6CAGAGTTA ATTTGGAAGA GT6CTTCA6G 240 

TCTGCAGACC TCATCC3GGTC AAGTCATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300 

TACACAGCCA GGGCTGTTGC G C TGTCTGAT AAGAAAAGAT CATTTACCAT ATCGCTTTCT 360 

GACAAAAOGA AACAGACACA 6AAAGAGGTT ACTGTGCTGC TAGAACATCA GAASAAGGTA 420 

TOGAAGACAA GACACACTAG AGAAACTGTT CTCA660GTG CCAA{3AGQA6 ATGG6CACCT 480 

ATTOCTTGCT CTATGCAAGA GAATTCCTT6 GGCCCTTTCC CATTGTTTCr TCAACftAGTT 540 

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGAOG TGGAGTTGAT 600 

AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG 6AAATC7ATT TTGCACTCGG 660 

CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATG06TC AACTGCA6AT 720 

GGATATTCAO CAGATCTSOC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAA7 TTTGAAGTTT TGGAAAGTAO TAGACCTGGT 840 

ACTACAGTGG GGGTG6TTTG TGCCACAGAC AGAQATGAAC" OGGACACAAT GCATACX30GC " 900 

CTGAAATAOl GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCX: 960 

AGCACAGGOQ TAATCACCAC AGTCTCTCAT TATTT6GACA GAGA6GTTGT AGACAAGTAC 1020 

TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTTTT TT6GATTGAT AOGCACATCA 1080 

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTAOGAAT ACCTATAGAA 1200 

GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGGGAAAT 1260 

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAA6GTGT TCTTTCTGTT 1320 

GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAACC TGGAAATTGG AGTAAACAAT 1380 

GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCC7 7GAACAGA6C CTTGGTTACA 1440 

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGOGG 1500 

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAAOGGCT ATAAGGCATA TGACCXTCXSAA 1560 

AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620 

ATTGATGAAA TTTCAGQ6TC AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 1680 

CXrCAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATG GGGTATACCG ACATTTTA6C TGTTGATCCT 1860 

GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920 

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980 

AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AA6ACAGG6C 0G6CCAAGCT 2040 

GCAACAAAAT TATT6AQAGT TAATCT6T6T GAATGTACTC ATOCAACTCA GT6T05T6CX3 2100 

ACTTCAAGGA GTACAG6A6T AATACTTG6A AAATGGGCAA TCCTT6CAAT ATTACTGGGT 2160 

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

GGGAAACGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACA6AAGCA 2280 

CCTQGAGAOG ATAGAGTGTO CTCT6CCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAOGTr TTTGTGGTAC TATGOGATCA GGAATGAAAA ATGGAGG6CA GQAAACCSITT 2400 

GAAAT6ATGA AAG6AGGAAA CXAGACCTTO GAATGCTGCC OGGGGGCTGO 0CA7CATGAT 2460 

ACCCTGGACT CCTGCAGGGG AGGACACAOG GA6GTG6ACA ACTGCAGATA CACTTACTGG 2520 

GAGTGGCACA GTTTTACTCA ACCXX3GTCTC GGTGAAGAAT CCATTAGAGG ACACACTGGT 2580 

TAAAAATTAA ACATAAAAGA AATTGCATOQ ATGTAATCAG AATGAAGACC GCATGGCATC 2640 

CCAAGATTAT GTCCTCACTT ATAACTATGA QGGAAGAGGA TCTCCA6CTG GTTCTGTGGG 2700 

CTGCTQCAGT GAAAAGCAGG AAGAAGATGG CCTTGACTTT TTAAATAATT TGGAACGCAA 2760 

ATTTATTACA TTAGCAGAAG CATGCACAAA GAGATAATGT CACAGTGCTA CAATTAGGTC 2820 

TTTGTCAGAC ATTCTGGAGG TTTCCAAAAA TAATATTGTA AAGTTCAATT TCAACATGTA 2880 

TGTATATGAT GATTTTTTTC TCAATTTTGA ATTATOCTAC TCACCAATTT ATATTTTTAA 2940 

AOCCAGTrOT TGCTTATCTT TTOCAAAAAG TGAAAAATGT TAAA ACAGAC AACTGGTA AA 3000 

TCTCAAACTC CA6CACTGGA ATTAAG6TCT CTAAAGCATC TGCTCTTTTT TTTTTTTAOG 3060 

GATATTTTAG TAATAAATAT GCTGGATAAA TATTAGTCCA ACAATAGCTA AGTTATGCTA 3120 

ATATCACATT ATTATGTATT CACTTTAAGT GATAGTTTAA AAAATAAACA AGAAATATTG 3180 

AGTATCACTA T6T6AAGAAA QTTTTGGAAA AGAAACAATG AAGACTGAAT TAAATTAAAA 3240 

ATGTTQCAOC TCA TAAAGA A TTGGQACTCA GOCCTACIGC ACTACCAAAT TCATTTGACT 3300 

TTGQA6QCAA AATGTGTTGA AGTGCCCTA7 (31AGTA6CAA TTTTCTATA6 GAATATAGTT 3360 

GGAAATAAAT GTGTGTCTGT ATATTATTAT TAATCAATGC AATATTTAAA ATGAAATGAQ 3420 

AACAAAGAGG AAAATGGTAA AAACTTGAAA TGAGGCTGGG 6TATAGTTTG TCCTACAATA 3480 

GAAAAAAGAG AGAGCTTCCT AG6CCTGG6C TCTTAAATGC TQCATTATAA CTGAGTCTAT 3540 

6AGGAAATAG TTCCTQTCCA ATTTGTGIAA TTTQTTTAAA ATTG1AAATA AATTAAACTT 3600 

TTCTGGTTTC TGTGG6AAGG AAATAGGGAA TCCAAT6GAA CAGTAGCTTT GCTTTGCAGT 3660 

CTGTTTCAAG ATTTCTQCAT CXACAAGTTA GTAGCAAACT GOGGAATACT CGCTGCAOCT 3720 

GGGGTTCCCT GCTTTTTQGT AGCAAGGGTC CAGAGATGAG GTOTTTTTTT OQGGGAGCTA 3780 

ATAACAAAAA CATTTTAAAA CTTACCTTTA CTGAAGTTAA ATCCTCTATT GCTGTTTCTA 3840 

TTCTCTCTTA TAOTGACCAA CATCTTTTTA ATTIAGATCC AAATAACCAT GTCCTOCTAG 3900 

AGTTTAGA6G CtAGAOSGAO CTGAGGGGAG GATCTTACTG AAA6CACCCT G60GA6ATTG 3960 

ATTGTCCTTA AACCTAAGCC CX3VCAAACTT GACACCTGAT CAGGTCTGGG AGCTACAAAA 4020 

TTTCATTTTT CTOCTCACTG CCCTTCTTCT GAGTGGCATT GOOCTGAATC AAGGAAAGCC 4080 

AG6CCTTGTG GGCOC^TTC TTTOGGCTTT CTGCTAAAQC AACACCTCCA GCA6AGATTC 4140 

OCTTAAGZGA CTOCAGGTTT TCCACCATOC nCAGOGTGA ATTAATTTTT AATCAGTTT6 4200 

CTTTCT0CA6 AGAAATTTTA AAATAATAGA AGAAATAGAA ATTTTGAAT6 TATAAAAGAA 4260 

AAA6A7CAAG T7GTCATT7T AGAACAGAGG GAACTTTGGG AGAAAGCA6C CCAAGTAGGT 4320 

TATTTGTACA GTCAGAGGGC AACAGGAAGA TGCAGGCCTT CAAGG6CAAG GAGAGGCCAC 4380 

AAGGAATAT6 66TGGGAGTA AAAGCAACAT C3GTCTGCTTC ATACTTTTTC CTAGGCTTGG 4440 



CACTGCCTTT TCCTTTCTCA GGCCMTG6C AACIGCCATT TGEAOTCOGGT GAGGQftTChO 4500 

CCAACCTCrr CTCTATGGCT CACCTTATTT GGAGTGftSAA ATCAAGGAGA CAGAGCTGAC 4560 

TGCATGATGA GTCTGAAGGC ATTTGCAGGA TGAGCCTGAA CTGGTTGT6C ASAACAAACA 4620 

AGGCATTCAT GGGAATTGTT GTATTCCTTC T6CAGCCCTC CTTCTGGGCA CTAAGAAGGT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACaTTTTCTG TTTTCTAATT 4740 

TGACCCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT TGCOCCCCCC OCCTTrTTTT 4800 

TTGAGAOGQl GTCTG6CTCT 6A0GCAC3U56 CTGGA6TGCA GTGGCTCC6A TCTCT6CTCA 4860 

CTGAAAGCTC OGCCTCCOGG GTTCAT6CCA TTCTCCTGCC TCAOCCTCCT GAGTAGCTGG 4920 

GACTAGAGGC GCCCACCACC AOGCXXX56CT AATTrTTTGT ATTTTTAATA GAGAOGGGGT 4980 

TTCACTGTGT TAGCCAGGAT GGTCTOGATC TCCTGACCTC GTGATOCXKX; TGCCTCGGCC 5040 

TCCCAAAOTC CXGGGATTAC AGCCATGACC CAOCGCTCCC G6CCTTGTTT TCOSTTTAAA 5100 

GTOGTCTTCT TTTAATGTAA TCATTTTGAA CATGTGTGAA AGTTGATCAT A0GAATT6GA 5160 

TCAATCTTGA AATACTC3UVC CAAAAGACAG TCOAGAAGCC AGGGGGAGAA A6AACTCAGG 5220 

GCACAAAATA TTGCTCTGAG AATGGAATTC TCTGTAAGCC TAGTTGCTGA AATTTCCTGC 5280 

TGTAACCAGA AGCCAGTTTT ATCTAAC6GC TACTGAAACA CCCACTGT6T TTTGCTCACT 5340 

CCCACTCAOC GATCAAAAOC TGCTACCTCC OCAAGACTTT ACTAGTGCOG ATAAACTTTC 5400 

TCAAAGAGCA ACCAGTATCA CV f C C LTCr r r TATAAAACCT CTAACCATCT CTTTGTTCTT 5460 

TGAACATGCT GAAAACCACC T6GTCTGCAT GTATOCCOGA ATTTGTAATT CTTTTCTCTC 5520 

AAATGAAAAT TTAATTTTAG GGATTCATTT CTATATTTTC ACATATGTAG TATTATTATT 5580 

TCCTTATATG TGTAAGGTGA AATTTATG6T ATTTGAGTGT GCAAGAAAAT ATATTTTTAA 5640 

AGCTTTCATT TTTCCCCCAG TGAATGATTT AGAATTTTTT AT6TAAATAT ACAGAATGTT 5700 

TTrrCTTACr TTTATAAGGA AGCAGCTGTC TAAAATGCAG TGGGGTTTGT ITTGCAATGT 5760 

TTTAAACAGA GTTTTAGTAT TGCTATTAAA AGAAGTTACT TTGCTTTTAA AGAAACTTGG 5820 

CTGCTTAAAA TAAGCAAAAA TTGGATGCAT AAAGTAATAT TTACAGATffT GGGGAGATGT 5880 

AATAAAACAA TATTAACTTQ OCTGCTTAAA ATAAGCAAAA ATTGGATGC3V TAAAGTAATA 5940 

TTTACAGATG TGGGGAGATG TAATAAAACA ATATTAACTT G6TTTCTTGT TTTTGCTGTA 6000 

TTTA6AGATT .AAATAATTCT AAGATGATCA CTTTGCAAAA TTAT6CTTAT GOCTGGCATG 6060 

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATQG GGAATATTTT OGACAATGTT 6120 

TCATTATCAA ATTGTOGACA TCATTAATAT ATATTGTAAT GTTGGGAAGA GATCACTATT 6180 

TTGAA6CACA GCTTTACAGA TGAGTATCTA TGATACATAT GTATAATAAA TTTTGATCGG 6240 

GTATTAAAAG TATTAGAAGG TGGTTATAAT TGCAGAGTAT TCCATGAATA GTACACT6AC 6300 

ACAGGG6TTT TACTTTGAGG ACCAGTGTAG TCAAGGGAAA ACATGAGTTA AAAAGAAAAG 6360 

CAOQCAAtAT T6CA0TCTTG ATTCT60CAC TTACAGGATA GATAATGCCT GAACTTTAAT 6420 

OACAASATQA TCCAAOCATA AAGGTGCTCT GTGCTTCACA GTGAATCTTT TCCCCATGCA 6480 

GGAGTGTGCT CCCCTACAAA CX5TTAAGACT GATCATTTCA AAAATCTATT AGCTATATCA 6540 

AAAGCCTTAC ATTTTAATAT AGGTTGAACC AAAATTTCAA TTCCAGTAAC TTCTATTGTA 6600 

ACCATTATTT TTGTOTATGT CTTCAAGAAT GTTCATTG6A TTTTTGTTTG TAATAGTAAA 6660 

ATACOGGATA CATTTCACGT GTOCTTCAGT ATTGATTTGG TTGAATATTG GGTCATAATO 6720 

GrrQAOAAGC ATGGACACTA GAGCCAGAAT GCTTGGATAT GAATCCTGGA TCTGTCACTT 6780 

ACTTCTGTGT GACCTTT6AA AGGCTACTTA TTTCCTCTCT TAGCTTTCTC ATTAAAATCA 6840 

ATGAACAATG CCAGCCTCAT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900 

ACATAGAAGA CTGCCTGCAC ATAGTAAAAG AATTATAAGT GTGAGGTAGT TGGTAAAAT7 6960 

ATOTAOTTOS ATATACTACC 6AACAATATC TAATCTCTTT TTAGGQAAAT AAAOTTTGTO 7020 
CATATATATA ATCCCGAAAC AT6 

Seq ID NO: 527 Protein sequence 
Protein Accession #: NP_077741.1 

1 11 21 31 41 51 

| | | | | | 

HAAAGPRRSV RGAVCLHI*LL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 60 

ADLIRSSDPD PRVUIDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VU;»EHQKKVS 120 

KY R HTRETVL RRAKHRHAPZ PCSNQENSLG PFPI>FXiCX3IVE SDAAQHYTVP YSISORGVDK 180 

EPLNLFYZER DT6NLFCTRP VDREEYDVFD LIAYASTADG YSADLPLFIiP IRVmaiDNB 240 

PVPTEAIYNP EVLESSRPGT TVGWCATDR DSPDIMHTRL KYSILQQTFR SPGLFSVHPS 300 

TGVITTVSHY LDREWDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND MAPTPRQNAY 360 

EAFVESEIAFN VEIIAIPIED KDLIMTANWR VNFTZLK03E NGHFKISTDK ETNEGVLSW 420 

KPUNYEBNSQ VNLGIGVNNB APFARDIPRV TALMRALVTV HVRDIiDEGPE CTPAAQYVRZ 480 

KEHLAVGSKX SGYKAYDFEK RNGNQLRYKK LBDPRGMtTZ DEIS6SIITS KILOREVETP 540 

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YWIC3CPKMG YTDILAVDPD 600 

EPVBGAPFYP SLPNTSPEIS RLWSLTKVND TAARLSYQIOI AGPQEYTIPI TVKDRAGQAA 660 

TKLLRVNLCE CTHPTQCRAT SRSTGVIIX5K WAILAXUiGI ALLPSVLLTL VOGVFGATKG 720 

KRFPEDLAQQ NLIISNTEAP GDDSVCSANO FKXQTTNN8S QGF0GTMGS6 KKNGGQBTIB 780 
HMKGGNQTZ£ SCRSkBBSBT LDSCRGGBTB VCNCRYTYSB HHSFTQFRLG EBSIRQBIG 

Seq ID NO: 528 DNA sequence 

Nucleic Acid Accession #: NM_001941.2 

Coding sequence: 64..2754 

1 11 21 31 41 51 

| | | | | | 

GGOUSGTCTC 6CTCTC66CA CCCTCCOSGC 6CX0G0GTTC T0CTG6CCCT GCXXXaSCATC 60 

CGGATG6CC6 C0GCTGG6CC OCSGOSCTCC GTQGGGGGAG OOGTCTGCCT 6CATCT8CTG 120 

CTGACCCTCG TGATCTTCAO T0GTGAT6GT GAAGCCTGCA AAAAGGTGAT ACTTAAIGTA 180 

CXrrrCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240 

TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGQTCAGTG 300 

TACACAGCCA GG6CTGTT6C GCTGTCTGAT AA6AAAA6AT CATTTACC3^ ATGGCTTTCT 360 

GACAAAAGGA AACAGACACA GAAA6AGGTT ACTGT6CTGC TAGAACATCA 6AA6AAGGTA 420 

TOGAAGACAA GACACACTAG AGAAACIGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACCT 480 

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540 

GAATCTGATO CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 600 

AAA6AAOCTT TAAATTTGTT TTATATAGAA AGAGACACTO GAAATCTATT TTGCACTCXS6 660 

CCT G T G GATC GTGAAGAATA TGATeTTTTT GATTTOATTO CTTATGCJGIC AACTGCAGAT 720 

GGATATTCaG CAGATCTGCC CCTCX?CACTA CCCATCAGGG TACaCGATta AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840 

ACTACAGTGG GGGTG6TTTG TGCCACAGAC AGAGATGAAC CGGACACAAT GCATACXSCGC 900 

CT6AAATACA GCATTTTGCA GCA6ACACCA A6GTCACCT6 GQCTCTrrTC TGTGCATCCC 960 
AGCftCAGGOG TAATCftCCAC AGTCTCTC A T TATTTGGACft GAgAG GlWf AGAGAAGXAC 1020 

TCMTGATAA TGAAAGTACft AGACAT6BAT iM M Jt M jri' m TTGGATT6AT PSSGOiCXTCk 1080 

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAGGA AAATGOVTTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200 

GAIAAGGATT TAATTAACAC T6CCAATT6G AGA6TCAATT TTACCATTTT AAAG6GAAAT 1260 

GAAAATGGAC ATTTCAAAAT CACCACAGAC AAAGAAACTA AT6AAG6TGT TCTTTCTGTT 1320 

GTAAAGCCAC TGAATTATGA A)GAAAACX33T CAAGTGAACC TGGAAATTGO AGTAAACAA? 1380 

GAA606CCAT TTGCTAGAGA TATTCCCAGA GTQACA6CCT TGAACAQAQC CTTCGTTACA 1440 

GTTCATGTGA GGQATCTGGA TGAGGGQCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500 

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAA066CT ATAAGGCATA TGACCCOGAA 1560 

AAIAGAAATG GCftATGGTTT AAGGTACAAA AAATT6CATG ATCCTAAAG6 TTGGATCACC 1620 

ATTGAT6AAA TTTCA6GGTC AATCATAACT TCCAAAATCC TG6A7AGGGA GQTTGAAACT 1680 

CCCAAAAATG AGTPGTATAA TATTAC3^GTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATGf GGGTATACCX3 ACATTTTAGC TGTTGATCCT 1860 

GATGI^AOCTa TCCA7GGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TOCMSAAATC 1920 

AGTA6ACTGT 6GAGCCTCAC CAAAGTTAAT 6ATACA6CTG CC0GTCT7TC ATATCAGAAA 1980 

AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAGGGC OGGCCAAGCT 2040 

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCG 2100 

ACTTCAA6GA GTACAGGAGT AATACTTGGA AAATGG6CAA TCCTT6CAAT ATTACTGGGT 2160 

ATA6CACTGC TCTTTTCTGT ATT6CTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

GGGAAACGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAA6CA 2280 

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTGGTAC TATGGGATCA 6GAATGAAAA ATGGAGGGCA GGAAACCATT 2400 

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460 

ACCCT6GACT CCTGCA6GG0 AGOACACAOG GAGGTGGACA ACTGCA6ATA CACTTACTCX3 2520 

GAGT6GCACA GTTTTACTCA ACCCCGTCTC GGTGAAAAAT . TGCATOGATG TAATCAGAAT 2580 

GAAGACCGCA TGCCATCCCA AGATTATGTC CTCACTTATA ACTATGAGGG AAGAGGATCT 2640 

CCAGCTGGTT CTGTGGGCTG CTGCAGTGAA AAGCAG6AAQ AAGAT6GCCT TGACTTTTTA 2700 

AATAATTTGG AACCCAAATT TATTACATTA GCAGAAC3CAT QCACAAAGAO ATAATGTCAC 2760 

AGTGCTACAA TTAGGTCTTT GTCAGACATT CTGGAGGTTT CCAAAAATAA TATrGTAAAG 2820 

TTCAATTTCA ACATGTATGT ATATGATQAT TTTTTTCTCA ATTTTGAATT ATGCTACTCA 2880 

CCAATTTATA TTTTTAAAGC CAGTTGrPGC TTATCTTTTC CAAAAAGTGA AAAATGTTAA 2940 

AACAGACAAC TGGTAAATCT CAAACTCCAG CACTGGAATT AAGGTCTCTA AAGCATCTGC 3000 

TCTTTTTTTT TTTTAOGGAT ATTTTAGTAA TAAATATGCT GGATAAATAT TAGTCCAACA 3060 

ATAGCTAAGT TATGCTAATA TCACATTATT ATGTATTCAC TTTAAGTGAT AOTTAAAAA 3120 

ATAAACAAGA AATATTGAGT ATCACTATGT GAAQAAAGTT TTGGAAAAGA AACAATGAAG 3180 

ACT6AATTAA ATTAAAAATQ TTOCAGCTGA TAAA6AATT0 GGACTCAOOC CXACTGCACT 3240 

ACCAAATTCA TTTQACTTTO GAGGCAAAAT 6TGTTGAAGT GCCCTAT6AA GXAGCAATTT 3300 

TCTATAGGAA TATAGTTGGA AATAAATGTG TGTGTGTATA TTATTATTAA TCAATGCAAT 3360 

ATTTAAAATG AAATGAGAAC AAAGAGGAAA ATGGTAAAAA CTTGAAATGA GGCTGGGGTA 3420 

TAGTTTGTCC TACAATAGAA AAAAGAGAGA GCTTCXn-AGG CCTG6QCTCT TAAATGCTGC 3480 

ATTATAACTS AOTCTATGAO OAAATAGTTC CTGTCCAATT TGTGTAATTT GrTTAAAATT 3540 

GTAAATAAAT TAAACTTTTC ' i X SG TlTCTC l ' GGGAAGGAAA TAG6GAAT0C AATGGAACAO 3600 

TAGCTTTGCT TTGCAGTCT6 TTTCAAGATT TCTGCATOCA CAA6TTAGTA 6CAAACTGGG 3660 

GAATACTOGC TGCAGCTGGG GTTOCCTGCT TTTTGGTAGC AAGGGTCCAG AGATGAGGTQ 3720 

TTTTTTTOGG G6AGCTAATA ACAAAAACAT TTTAAAACTT ACCTTTACTG AAGTTAAATC 3780 

CTCXA7TGCT GTTTCTATTC TCTCTTATAO TGAOCAACAT CTTTTTAATT TAGATCCAAA 3840 

TAACCATGTC CTCCTAGAST TTAQAGGCTA 6RGQGACCT0 AGGOGAGQAT CTTACTGAAA 3900 

GCaCCCTGGG GAGATTGATT GTCCTTAAAC CTAAGCCCCA CAAACTTGAC ACCTGATCAG 3960 

GTCTGGGAGC TACAAAATTT CATTTTTCTC CTCACTGCCC TTCTTCTGAG TGGCATTGGC 4020 

CTGAATCAAG GAAAGCCAGG CCTIGTGGGC CCCCTTCTTT CGGCTTTCTO CTAAAGCAAC 4080 

ACCTCCAGCA GAGATTCOCT TAAGTGACTC CAQOmTCC ACCATCCTTC A00GTGAA7T 4140 

AATTTTTAAT CAGTTTQCTT TCTCCAGAGA AATTTTAAAA TAA7AGAA6A AATAGAAATT 4200 

TTGAATGTAT AAAA6AAAAA GATCAAGTTG TOlTTTTAGA ACAGAGGGAA CTTTGGGAGA 4260 

AAGCAGCCCA AGTAGGTTAT TTGTACACTrC AGAG6GCAAC AG6AAGATGC AQGCCTTCAA 4320 

GG6CAAGGA0 AGGCCAOUUS GAATATGGGT GGGAGTAAAA GCAACATCX3T CTGCTTCATA 4380 

CTTTTTCCTA GGCTT66CAC TGCCTTTTCC TTTCTGA56C CAAIGQCAAC TQCCATTTGA 4440 

GTCCGGTGAG GQATCA6CCA ACCTCTTCTC TATGGCTCAC CTTATTTGOA GT6A6AAATC 4500 

AA6QAGACA3 AGCTGACTOC ATGATGAGTC TGAAGGCATT TGCAG6ATGA GCCTGAACTG 4560 

GTTGTGCAGA ACAAACAAGG CATTCATGGG AATTGTTGTA TTCCTTCTGC AOCCCTCCTT 4620 

CTGGGCACrA AGAAGGTCTA TGAATTAAAT GCCTATCTAA AATTCTGATT TATTCCTACA 4680 

rmt:im " n ' tctaatttga ccctaaaatc tatgtqtttt agacttaqac tttttattgc 4740 

CCCCCCCXX3C ■ irmTmxJ AGACQGAOTC TOSCICTGAC QCACAOGCTG GAGTGCAGTG 4800 

GCTCCX3ATCT CTGCTCACTG AAA6CTC06C CTCCCG6GTT CATGCCATTC TOCTGCCTCA 4860 

GCCTCCTGAG TAQCTGGGAC TACAGGOSCC CACCACCAOG CCOGGCTAAT TTTTTGTATT 4920 

TTTAATA6AO A0QG6GTTTC ACTGT G TT A G CCAGGAT6GT CTCGATCTCC ItSACCTCXmS 4980 

ATCCGCC7GC CT0G6CCTCC CAAA6TSCT6 GGATTACAOO CATGAOOCAC OGCTCCCGOC 5040 

CTTGTTTTCC GTTTAAAGTC GTCTTCTrTT AATOTAATCA TTTTGAACAT GTGTGAAAGT 5100 

TGATCATAOG AATTGQATCA ATCTTGAAAT ACTCAACX»A AAGACAGTCG AGAAGCCAGG 5160 

GGGAGAAAGA ACTCAGGGCA CAAAATATTG GTCTGAGAAT GGAATTCTCT CTAAGCCTAG 5220 

TT6C7GAAAT TTCCT G CT GT AACCA6AAGC CAGnTTATC TAAOSGCTAC TGAAACACCC 5280 

ACTGIGTTTT GCTCACTCOC TCACTCA006 ATCAAAAOCT GCTACCTOOC CAAGACTTTA 5340 

CXAGTGCOGA TAAACTTTCT CAAAGA6CAA CCAQTATCAC TTCCCTGTTT ATAAAAGCTC 5400 

TAACCATCTC TTTOTTCTTT QAACATOCTG AAAACCAOCT GGTCTQCATG TATGCC06AA 5460 

TTTGTAATTC TTTTCTCTCA AATGAAAATT TAATTTTAGG GATTCATTTC TATATTTTCA 5520 

CATATGTAGT ATTATTATTT CCTTATAT6T GTAAGGTGAA ATTTATGGTA TTTGAGTGTQ 5580 

CAAGAAAATA TATTTTTAAA GCTTTCATTT TTC CO CCACT GAATGATTTA GAATTTTTIA 5640 

TGTAAATATA CAGAATGTTT T7TCTTACTT TTATAAGGAA GCAOCTGTCT AAAAT6CAGT 5700 

GGGGTTTGTT TTGCAATGIT TTAAACAGAG TTTTAGTATT 6CTATTAAAA GAAGTTACTT 5760 

TGCTTTTAAA GAAACTTGGC TGCTTAAAAT AAGCAAAAAT T6GATGCATA AAGTAATATT 5820 

TACRGATGT6 GGGA6ATGTA ATAAAACAAT ATTAACTTGG ITlCrilfm TTGCrGTATT 5880 

TftGAGATTAA ATAATTCTAA QATGATChCT TTGCAAAATT AT6CTTA1GG CTGGCATGGA 5940 

AAXA6AAATA CTCAATTA7XS TCTmGTTGT ATTAATGGOO AATATTITGG ACAATGTT7C 6000 

ATTATCAAAT T6T0GACATC ATTAATATAT ATTGTAATGT T6GGAAGAGA TCACtATTTT 6060 

GAAGC31CAGC TTTACAGATO AGTATCTATG ATACATATGT AXAATAAATT TTGATOSGGT 6120 

ATXAAAAGTA TTAGAAGGTG GTTATAATTG CAGAGTA7TC CATGAATAGT ACACTGACAC 6180 
AGGGGTTTTA CTTTGAGGAC CAfiTGTAGTC AAGGGAAAAC ATGAGTTAAA AA6AAAAGCA 6240 

GGCAATATTG CAGTCTTGAT TCTGCCACTT ACAGGATAGA TAATGCCTGA ACTTTAAIGA 6300 

CAAGATGATC CAACCATAAA 66T6CTCTGT GCTTCACAGT GAATCTTTTC CCCAT6CAGG 6360 

AGTGTGCTCC CCTACAAACO TTAAGACTGA TCATTTCAAA AATCTAT1A6 CTATATCAAA 6420 

AQCCTTACAT TTTAATATAG GTTGAACCAA AATTTCAATT CCAGTAACTT CrATTGTAAC 6480 

CATTATTTTT GTGTATGTCT TCAAGAATGT TCATTGGATT TTTGTTTGTA ATAGTAAAAT 6540 

ACOGGATACA TTTCACGTGT CCTTCAGTAT TGATTTGGTT GAATATTGGG TCATAATGGT 6600 

TGAGAAGCAT GGACACTAGA GCCAGAATGC TTGGATATGA ATCCTGGATC TGTCACTTAC 6660 

TTCTGTGTGA CCTTTGAAAG GCTACTTATT TOCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720 

GAACAATGCC AGCCTCATGG GGTTGTTGAA TGATTAAATT AGTTAATATA C3CTAAAGTAC 6780 

ATAGAACACr GCCTGCACAT AGTAAAAGAA TTATAAOIGT GAGGTAGTTX5 GTAAAATTAT 6840 

GTAOTTGGAT ATACTACX3GA ACAATATCTA ATCTCTTTTT AG66AAATAA AOTTT6T6CA 6900 
TATATATAAT CCCGAAACAT G 

Seq ID NO: 529 Protein sequence 
Protein Accession #: NP_001932.1 

1 11 21 31 41 51 

| | | | | | 

KAAAGPRRSV RGAVCLHIiLL TIiVZFSRDGE ACKKVZU7VP SKZiEADKIZG RVNLEBCFRS 60 

ADLIRSSDPD FRVLUDOSVy TARAVALSDK KRSPTZWLSD KRKQTQKEVT VZiLEBQKKVS 120 

KTRHTRBTVL RSAKRRMAPZ PCSMQENSLG pFFLFIiQQVB SDAAC^nrrvp ySZSGRGVDX 180 

EPLNLFYIER DTQJLFCTRP VDREEYDVPD LZAYASTADG YSADLPZ.PLP ZRVEDENDNH 240 

PVPTEAIYNP EVLESSRPGT TVGWCATDR DBPDTKHTRL KYSIWJQTPR SPGLFSVHPS 300 

TGVITTVSHY LDREWDKYS LI^aCVQDMDG QFFGLIGTST CIZTVTOSND NAPTPRQ17AY 360 

EAFVEQIAm VEILRIPIED KDLINTANWR VNFTIZiKGNB NGBFKI5TDK EINBGVLSW 420 

KPUTYEENRQ VNLEIGVNNE APPARDIPRV TALNRALVTV HVRDUJBGPB CTPAAQYVRI 480 

KHNLAVGSXI NGYKAYDPEN [missing] LHDPKGWITI DEI5GSZ1TS KZLDREVBTP 540 

KHELTOIITVL AIDKDDRSCT GTLAVNIEDV NDKPPEILQE YWICKPKMG YTDILAVDPD 600 

EPVHGAPpyP SLPNTSPSIS RLWSLTKVND TAARLSYQKN AGFQEYTZPZ TVKDSA6QAA 660 

TKLIiRVNLCE CTHPTQCIiAT SRSTGVILGR HAILAILLGX AXiLPSVIiLTL VOGVFGATXG 720 

KRPPEOLAQQ HLIISNT&AP CTDRVCSANG FMrOTTNMSS QGP0GTTCGS6 KKNGGQETZE 780 

MMKGGKQTIiE [missing] [missing] VDNCRYTYSB WHSPTQPRLG [missing] 840 
DRMPSOOYVXt TYNYBSRGSP AGSVGOCSEK QEBD6LDFUI NLEPKFITLA BACTKR 

Seq ID NO: 530 DNA sequence 

Nucleic Acid Accession #: NM_016583.2 

Coding sequence: 72..843 

1 11 21 31 41 51 

| | | | | | 

GGAGTGGGGG AGA6AGAGGA GACCAOSACA GCTGCTGAGA CCTCTAAGAA GTCCAGATAC 60 

TAAGAOCAAA GATGTTTCAA ACTGGGGGCC TCATTGTCTT CTACGGGCTG TTAGCCCAGA 120 

CCATGGCCCA GTTTGGAGGC CTGCCCGTGC COCTGGACCA GACCCTGCCC TT6AATGTGA 180 

ATOCAGCCCT GCCCTTGAGT COOiCAaGTC TTOCAOQAAO CTTSACAAAT GCCCTCAGCA 240 

ATGGCCroCT GTCTGGGGGC CTGTTGGGCA TTCraOAAAA CCTTCOGCTC CTGGACATCC 300 

TGAAGCCTGG AGGAGGTACT TCTGGTGGCC TOCTTGGGGG ACTGCTTGGA AAAGTGACGT 360 

CAGTGATTCC TGGCCTGAAC AACATCATTG ACATAAAGGT CACTGACCXX: CAGCTGCTGG 420 

AAC7TGGCCT TGTGCAGAGC CCTGAT66CC ACOGTCTCTA TGTCAOCATC CCTCTOQGCA 480 

TAAA6CTCCA A6TQAATA08 CXXXTCGTCXS GITGCAAOTCT GTTGAGOCTO OCTOTGAAGC 540 

TGGACATCAC TGCA6AAATC TTAGCTGTGA GAGATAAGCA GGAGAGGATC CACCT6GTCC 600 

TTGGTGACTG CACCX3^TTCC CCTG6AAGCC TGCAAATTTC TCTGCTTGAT GGACTTGGCC 660 

CCCTCOCXAT TCAAOGTCTT CTGGACAGCC TCACAGGGAT CTTGAATAAA GTCCTGCCTG 720 

AGTTGGTTCA GGGCRAOGTG TGCCCTCTGG TCAATGAGGT TCTCAGAGGC TTGGACATCA 780 

CCCTGGTGCA TGACATTGTT AACATGCTGA TCCAOGGACT ACAGTTTGTC ATCAAOGTCr 840 

AAGCXTTCCA GGAAGGG6CT GGCCTCT6CT GAGCTGCTTC CCAGTGCTCA GAOATGGCro 900 

GCCCATGTGC TGGAAGATGA CACAGTTGCC TTCTCTCOGA GQAAOCTOCX: CCCTCTOCTT 960 

TCCCACCAG6 CGTGTGTAAC ATCCCAT0T6 OCTCACCTAA TAAAATGGCT CTTCTTCTGC 1020 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 

Seq ID NO: 531 Protein sequence 
Protein Accession #: NP_057667.1 

1 11 21 31 41 51 

| | | | | | 

MFQTGGIiZVF YGLLAOTMAQ PGGIiPVPIiDQ TLPLNVNPAL PLSPTGLAGS LTKALSNQIiI. 60 

SGGLL61LQ3 LPLLDZLKPG GGTSGGLLGG UjGKVTSVIP GLHKIIDXKV TDPQLLEXiGL 120 

VQSPOGKRLY VTZPLGIKZiQ VNTPLVGASL LRLAVKLDIT ABILAVRDXQ ERIHLVLGDC 180 

IB5P6SLQZS LLDGLGPItPZ QGIiU)SLT6Z LNKVLPELVQ GNVCPLVMEV LRGLDITLVH 240 
DZVNKblHtai QFVIKV 

Seq ID NO: 532 DNA sequence 

Nucleic Acid Accession #: NM_004363.1 

Coding sequence: 115..2223 

1 11 21 31 41 51 

| | | | | | 

CTCAGGGCAG AGGGAGGAAG GACAGCAGAC CAGACAGTCA CAGCAGCCTT GACAAAAOGT 60 

TCCTGGAACT CAAGCTCTTC TCCACAGAG6 AGGACAGA6C AGACAGCAGA GACCATGGAG 120 

TCTCOCTOGG CCOCTCCCCA CAGATGGTGC ATCCCCTGGC AGAGGCTCC7 GCTCACAGCC 180 

TCACTTCIAA CCITCtOGAA GCG8CGCACC ACTQGCAAOC TCACTATTGA ATCCA0SG06 240 

TTCAAT6T0G CAGAG6G6AA OGAOGTGCIT CXACTTGTOC ACAATCTGCC CCAGCATCTT 300 

TTT6GCTACA GCTGGTACAA AGGTGAAAGA GTGGAK^CA ACCGTCAAAT TATAGGATAT 360 

GTAATAGGAA CTCAACAAQC TACCCCAGGG CCCGCATACA GTGGTCGAGA GATAATATAC 420 

CCCAATGCAT COCTQCTGAT CCAGAACATC ATCCAGAA7G AGACAGGATT CTACACCCTA 480 

CAOGTOklAA ASTCAGATCT TQTGAATGAA GAAGCAACTG GGCAGTTOCB G6TA1ACC06 540 

GAGCTGCCCA A6CCCTCCAT CTOCAGOUIC AACTCCMU^ C0STG6AG6A CAAGGMGCT 600 

GTGGCCTTCA CCTGTGAACC TGAQACTCAG GAOGCAACCT ACCTGTGGTG GGTAAACAAT 660 

CAGAGCCrcC CGGTCAGTCC CAGGCTGCAG CTGTCCAATG GCAACAGGAC CCTCACTCTA 720 

TTCAAT6TCA CAAGAAATGA CACAGCAAGC TACAAA7GTG AAACCCAGAA CCCAGTGAGT 780 

GOCAOGOSO^ 6IGATTCAGT CATCCTGAAT GTCCTCTATG 6C00QGAT6C CCCCACCATT 840 

TCCCCTCTAA AdCATCTTA CAGATCAGGO GAAAATCTGA ACCTCTC C P Q CCACGCAGCC 900 

TCTAACCCAC CTGCACAGTA CTCTTGGTTT GTCAATGGGA CTTTCCAGCA ATCCACCCAA 960 

GAGCrcrTTA TCCCCAACAT CACTGT6AAT AATAGT6GAT CCTATACGTG CX^AGCCCAT 1020 

AACTCAGACA CTG6CCTCAA TAGGACCACA GTCA0GA06A TCACAGTCTA T6CAGAGCCA 1080 

COCAAAGOCT TCATCACCAG CAACAACTCC AACCCCGTG6 AG6ATGAGGA T6CTGTA6CC 1140 

TTAACCTGT6 AACCT6AGAT TCAGAACACA ACCTACCTGT GGTGGGTAAA TAATCAGAGC 1200 

CTCOOGGTCA GTCCCAOGCT GCAGCTGTCC AATGACAACA GGACCCTCAC TCTACTCAGT 1260 

GTCACAAGGA ATGATGTAGG ACCCTATGAO TGTGGAATCC AGAACGAATT AAGT6TTGAC 1320 

CACAGCGACC CAGTCATCCT GAATGTCCTC TATG6CCCAG AOGACCCCAC CATTTCCCCC 1380 

TCATACACCT ATTAOOGTCC AGGpGTGAAC CTCAGCCTCT GCT6CCATGC AGCCTCTAAC 1440 

CCACCT6CAC AGTATTCTTG GCTGATTGAT GGGAACATCC AGCAACACAC ACAAGAGCTC 1500 

TTTATCTCCA ACATCACTGA GAAGAACAGC GGACTCTATA CCTGCTAGGC CAATAACTCA 1560 

GCCAGTGGCC ACAGCAGGAC TACAGTCAAG ACAATCACAG TCTCTGCGGA GCTGCCCAAG 1620 

CCXrrCCATCT CCAGCAACAA CrCCAAACCC GTGGAG6ACA AGGATGCTGT GGOCTTCACC 1680 

TGTGAACCTG AGGCTCAGAA CACAACXTTAC CTGT6GTGGG TAAATGOTCA GAGGCTGCCA 1740 

GTCAGTCCCA GGCTGCAGCT GTCCAATGGC AACAGGACCC TCACTCTATT CAATGTCACA 1800 

AGAAATX3ACG CAAGAGCCTA TGTATGTGGA ATCCAGAACT CAGTGAGTGC AAACCGCAGT 1860 

GACCCAGTCA CCCTGGATGT CCTCTATGGG CCGGACACCC CCATCATTTC CCCCCCAGAC 1920 

TCX5TCTTACC TTT0C5GGAGC GAACCTCAAC CTCTCCTGCC ACTCGGCCTC TAA CCCAT CC 1980 

COGCAGTATT CTTGOCGTAT CAATGGGATA CXX3CAGCAAC ACACACAAGT TCTCTTTATC 2040 

GCCAAAATCA OGCCAAATAA TAACGGGACC TATGCCTGIT TTGTCTCTAA CTTGGCTACT 2100 

GGCCGCAATA ATTCCATAGT CAAGAGCATC ACAGTCTCTG CATCTGGAAC TTCtCCTGGT 2160 

CTCTCAGCTG GGGCC3VCPGT OGGCATCATQ ATTGGA6TGC TGGTTG GGGT TGCTCTGATA 2220 

TAGCAGCCCT GGTOTAGTTT CTTCATTTCA GGAAQACTGA CAGTTGTTTT GCTTCTTCCT 2280 

TAAAGCATTT GCAACAGCTA CAGTCTAAAA TTl jCn 'C m' ACCAAGGATA TTTACAGAAA 2340 

AGACTCTGAC CAGAGATCGA GACCATCCTA GCCAACAT06 TGAAACCCCA TCTCTACTAA 2400 

AAAXACAAAA ATGA0CT660 m W l MUO i 06CA0CT6TA GTCCCAGTTA CT06GGAGGC 2460 

TGAGGCA6GA GAAT06CTTG AACCCGG6AG GTG(»GArtG GAGTGAGCOC AGATCGCACC 2520 

ACTQCACTCC AGTCTGGCAA CAGAGC3VAGA CTCCATCTCA AAAAGAAAAG AAAAGAAGAC 2580 

TCTGACCTGT ACTCTTGAAT ACAAGTTTCT GATACCACTG GACTGTCTGA GAATTTCCAA 2640 

AACTTTAATG AACTAACTGA CAGCTTCATG AAACTGTCXA CCAAGATCAA GCAGAGAAAA 2700 

TAATTAATTT CATOOGACTA AATGAACTAA TGA6GATTGC TGATTCTTTA AATGTCTIGT 2760 

TTCCCAGATT TCAGGAAACT rmTAVm ' TAAGCTATCC ACTCTTACAG CAATTTGATA 2820 

AAATATACTT TTGTGAACAA AAATTGAGAC ATTTACATTT TCTCCCTATO TGGTC6CTCC 2880 

AGACTTGGGA AACTATTCAT 6AATATTTAT ATTGTATGGT AATATAGTTA TT6CACAAGT 2940 
TCAATAAAAA TCIGCTCTTT GTATAACA6A AAAA 

Seq ID NO: 533 Protein sequence 
Protein Accession #: NP_004354.1 

1 11 21 31 41 51 

| | | | | | 

MESPSAPPBR MCZFHQRLLL TASLLTFWNP PTTAKLTZSS TPFNVAE6KB VLLLVENIiPQ 60 

HLPGYSWYKG ERVDGMRQII GYVIGTQQAT PGPAYSGRBI lYPNASLLIQ NIIQNDTGPy 120 

TLHVIKSDLV NEEATGQFRV YPBLPKPSIS SNNSKPVEDK DAVAPTCEPE TQDATYLWWV 180 

NKQSLPVSPR LQLSNGNRTL TLFNVTRNDT ASYKCETQNP VSARRSDSVI LMVLYGPDAP 240 

TZSPUarSYR SGENUILSCH AASNPPAQYS frPVNQTFQQS TQELFZPHZT VNNSGSYTOQ 300 

AHNSDTOLNR TTVTTITVyA BPPKPPITSM NSNPVEDEDA VALTCEPBIQ HTTYLWHWII 360 

QSLFVSPRIiQ LSNCMRTLTIi LSVTRNDVGP YEOGIQNELS VDHSDPVILM VLYGPDDPTI 420 

SPSYTYYRPG VMLSLSCHAA SNPPAQYSWL XDQNIQQHTO ELPISNITEK NSGLYTCQAH 480 

NSA5GESRTT VKTITVSAEL PKPSISSNNS KFVEDKDAVA FTCEPEAQNT TYLHHVNGQS 540 

LPVSPRLQLS NtaiRTLTIiFK VTRKDARAYV OGZQHSVSAN RSDPVTU3VX. Y6PDTPIISP 600 

FDSSYtiSQAN U9LSCHSASH PSPQYSWRIK GZPQQBTQVL PZAKITBQIH GTYACPVSHL 660 
ATGRNNSIVX SITVSASGTS PGLSAGATVG IKI6VLVGVA LI 

Seq ID NO: 534 DNA sequence 

Nucleic Acid Accession #: NM_006952.1 

Coding sequence: 11..793 

1 11 21 31 41 51 

| | | | | | 

AATCCOQACA ATGGCGAAAO ACAACTCAAC TOTTCXITTGC TTCCAGGGCC T6CTGATTTT 60 

TGGAAATGTO ATTATTGOTT GTT0C66CAT TGCOCTGACT GOGGAOTGCA TCTTCTTTGT 120 

ATCTGAOCAA CACAGCCTCT ACCCACTGCT TGAAGCCACC GAC3VA0GATG ACATCTATGG 180 

GGCTGCCTGG ATC5GGCATAT rTGTGGGCAT CTGCCTCTTC TGCCTGTCTO TTCTAGGCAT 240 

TGTAGGCATC ATGAAGTCCA GCAGGAAAAT TCTTCTGGCG TATTTCATTC TGAT6TTTAT 300 

A6TATATGGC TTTGAAGTGQ CATCTTGTAT CAGAOCAGCA ACACAAOGAG ACTTTTTCAC 360 

ACCCAAOCTC TTCCTGAAGC AGATOCTAGA GAGGTACCAA AACAACAGCC CTCCAAACAA 420 

TGATGACXaWS TOQAAAAACA ATGGAGTCAC CAAAACCTGG GACAGGCTCA TGCTCCAGGA 480 

CAATTGCTGT GGOGTAAATG GTCCATCAGA CTGGCAAAAA TACACATCTG CCTTCCGGAC 540 

TGAGAATAAT GATGCTGACT ATOCCTGGCC TOSTCAATGC TGTGTTATGA ACAATCTTAA 600 

AGAACCTCTC AACCTGGAGG CTTGTAAACT AGGOBIGCCT GGTTTTTATC ACAATCA6G6 660 

CTGCTATGAA CIGATCTCTG GTCCAATGAA C0GACA06CC TGGGGGGTT6 CCTGGTTTGG 720 

ATTT6CXATT CTCT G CTGQA CTTTTTGGGT TCTCCTGGGT ACCATGTTCT ACTGGAOCAG 780 
AATTGAATAT TAAGAA 

Seq ID NO: 535 Protein sequence 
Protein Accession #: NP_008883.1 

1 11 21 31 41 51 

| | | | | | 

MAKDNSTVRC FQGLI>IF(3IV IIGCOGZALT AEC2FFVa3Q HSLYPLLEAT DNTOIYGAAN 60 

IGIFVGXCLF CLSVU3IVGI KKSSRKILIA YFILNFIVnV FEVASCITAA TQRDFFTFHI. 120 

FLKQMLERVQ NNSPFNKDDQ WKNNGVTKTH DSIMLQDKOC GVNGPSDWQK YTSAFRTENN 180 

OADYPWPRQC CmnnSSPh miEACXUCm GPyENQGCYE LZSGPMKBHA HGVAHFGFAI 240 
LCWTFWVLLG TMFYWSRIEY 

Seq ID NO: 536 DNA sequence 

Nucleic Acid Accession #: NM_002638.1 

Coding sequence: 120..473 

1 11 21 31 41 51 

| | | | | | 

CAAXACAGCT AAGGAATTAT CCC7TGTAAA TACCACAGAC CCGCCCTCGA GCOiGGCCMi 60 

GCTGGACTGC ATAAAGATTQ GTATCSGCCTT AGCTCTTAGC CAAACACCTT CCTGACACCA 120 

TGAGG6CCA6 CAGCTTCTTG ATCGTGGTGG TGTTCCTCAT 0GCTG0GAC6 CTG6TTCTA6 180 

AGGCAGCIGT GACGGGAGTT CCTGTTAAAG GTCAAGACAC TGTCAAAG6C 06TGTTGCAT 240 

TCAATG6ACA AGATCCCGTT AAA6GACAAG TTTCAGITAA AGGTCAAGAT AAAGTCAAAG 300 

CGCAAGAGCC AGTCAAAGGT CCAGTCTCCA CTAAGCCTGG CTCCTGCCCC ATTATCTTGA 360 

TCCGGTGOGC CATGTTGAAT CCCCCTAACC 6CTQCTTGAA AGATACTGAC TGCCCAGGAA 420 

TCAAGAAGTG CTGTGAAGGC TCTTGCGGGA TGGCCTGTTT CGTTCCCCAG TGAAGGGAGC 480 

OGGTCCTTGC TGCACCTGTQ CCGTCCCCAG AGCTACAGGC CCCATCTGGT CCTAAGTCCC 540 

TGCTGCCCTT CCCCTTCCCA CACTGTCCAT TCTTCCTCCC ATTCAGGATG CCCAOSGCTO 600 
GAGCTGOCTC TCTCATCCAC TTTCCAATAA A 

Seq ID NO: 537 Protein sequence
Protein Accession #: NP_002629.1

1 11 21 31 41 51

| | | | | |

MRA5SFLIW VFZiZAGTLVL EAAVT6VFVX GQDTVKCaiVP F19GQDPVKGQ VSVXGCJSDKVK 60 

AQEPVKGPV9 TKPGSCPZIIi XRCAMLHPFN RCUOnDCPO IKKCCEGSCQ MACFVPQ 

Seq ID NO: 538 DNA sequence 

Nucleic Acid Accession #: NM_001793.2

Coding sequence: 71..2560

1 11 21 31 41 51

| | | | | |

AAAGGGGCAA GAGCTGASCS GAACACCGGC CGQGGQTOGC GGCASCTOCT TCACCOCTCT 60 

CTCTGCAGOC ATGGGGCTCC CTOGTGGACC TCTOOOQTCT CTCCTCCTTC TCCAGGTTTG 120 

CTGGCTGCAG T6C6CG6CCT CCQAGCOGTG COGGGCGGTC TTCAGGGAGG CTGAAGTGAC 180 

CTTGGAGGOO GGAGGCGOSQ AGCAGGAGCC CGGCCAGG06 CTGGGGAAAG TATTCATGGG 240 

CT GC CCtOG Q CAAGAGCCAG CTCTGTTTAO CACTGATAAT 6ATGACTTCA CIOTGCQGAA 300 

TCGCBAGACA 6TCCAG6AAA GAAGGTCACT 6AAG6AAAGG AATOCATTGA AGATCTTCCC 360 

ATCCAAACGT ATCTTACGAA GACACAAGAG AGATTOGGTQ GTTGCTCCAA TATCTGTCCC 420 

TGAAAATGGC AAGGGTCCCT TCCCCCAGAO ACTGAATCAG CTCAAGTCTA ATAAAGATAG 480 

AGACACCAAG ATTTTCTACA GCArCACOGO GCCGGGGGCA GACAGCCCCC CTGAGG6TGT 540 

CTTOOCTGIA OAGAAGGAGA CAG6CIGGTT GTTGTTGAAT AAGCCACTGG A008GQAGQA 600 

QATTGCCAAO TA7GAGCTCT Yl'GGCCAOGC TGTGTCA6AG AATGGTGCCT CAOTOQAGGA 660 

CCCCATGAAC ATCTCCATCA TCGTGACCGA CCAGAATGAC CACAAGCCCA AGTTTACCCA 720 

GGACACCTTC CGAGGGAGTG TCTTAGAGGG AGTCCTACCA GGTACTTCTG TGATGCAGGT 780 

GACA6CCA0G GATGAGGATQ ATGOCATCTA CACCTACAAT GGGGTGGTT6 CTTACTCCAT 840 

OCATAOOCAA GAAOCAAAOG ACOCAGAGGA CCTGATOTTC ACCA7TCA0C 6GAGCACAGG 900 

CACCATCA6C OTGATCTCCA GT06CCTGGA COGGGAAAAA GTCCCTGA6T ACACACTGAC 960 

CATCCAGGCC ACAGACATGG ATGGGGAC60 CTCCACCACC ACG6CAGTGG CAGTAOTGGA 1020 

GATCCTTGAT GCCAATGACA ATGCTCCCAT GTTTGACCCC CAGAAGTACG AOOCCCATCT 1080 

GCCTGAGAAT GCAGTGG6CC ATGAGGTGCA GAOGCTaAOO GTCACTGATC TG6AG6CCCC 1140 

CAACTCACCA OOQTOGCGTG CCACCTACCT TATCAT9GGC G6TGACGA0G Q6GACCATTT 1200 

TACCATCACC ACCCACCCTG AGA6CAACCA GGGCAT0CT6 ACAACCAGGA AGGGTTTOGA 1260 

TTTT6AGGCC AAAAACCAGC ACACCCTGTA OGTTGAAGTG ACCAA06AGG CCCCTTTTGT 1320 

GCTGAAGCTC CCAACCTCCA CAGCCACCAT AGTGGTCCAC GTGGAGGATG TGAATGAGGC 1380 

ACCTGTGTTT GTCCCACCCT CCAAAGTGGT TGAGGTCCA6 6AGG6CATGC CCACTGGGGA 1440 

GCCTGTGTOT GTCTACACTQ CAGAAGACCC TGACAAOGAG AATCAAAAGA TCA6CTAC03 1500 

CATOCTGAGA GACCCAGCAO GGTGGCTAGC CATGGAOOCA GACA O TGG G C AG6TCACAGC 1560 

TGTGGGCACC CTOGACCGTG AGGATGAGCA GTTTGTGAGG AACAACATCT ATQAAGTCAT 1620 

GGTCTTGGCC ATGGACAATG GAAGCCCTCC CACCACTGOC AOJOGAACCC TTCTOCTAAC 1680 

ACTGATTGAT GTCAATGACC ATGGCCCAGT CCCTGA6CCC CGTCAGATCA CCATCTGCAA 1740 

CCAAAGOOCT GTGOGCCAOa TGCTGAACAT CAOGQACAAG GACCT G TCT C CCCACACCTC 1800 

CCCTTTOCAO GCCCA6CTCA CAGATGACTC AGACATCTAC TG6A0G6CAG A06TCAACGA 1860 

GGAAOOTGAC ACAGTGGTCT TGTCCCTGAA GAAGTTCCTO AAGCAGGATA CATATGA06T 1920 

GCACCTTTCT CTGTCTGACC ATGGCAACAA AGAGCAGCTG AC6GTGATCA GGGCCACTGT 1980 

GTG06ACIGC CATGGCCATG TOGAAACCTG CCCIGGAOCC TGGAAGGGAO GTTTGATCCT 2040 

CUL ' lVrB CT U QGGGCTBTOC TGQCTCTQCT GTTOCTOCro CTGGTG C TOC TTTTGTTGGT 2100 

GAGAAAGAAG 06GAAGATCA AG6ACCCCCT CCTACTCCCA GAAGATGACA CCOSTGACAA 2160 

CGTCTTCTAC 7ATG0CGAAG AGGGGGGTGG C6AAGA6GAC CA6GACTATG ACATCACCCA 2220 

GCTCCACCXSA GGTCTQGAGG CCAGGCOGGA GGTSGlTCrc CGCRAltSAOG IIGGCACCAAC 2280 

CATCATOCOG ACACCCATGT ACOSTGCTOO GCCAGOCAAC CCA6ATGAAA TG86CAACTT 2340 

TATAATTQAG AACCI6AA06 066CTAACAC AQAOCCCACA GCCOOGCCCT AG6ACAC0CT 2400 

CnWi XS T l X; GACTATGAGG GCAOOGGCTC OGACGCCGOG TCCCTGAGCT GOCTCACCTC 2460 

CTCC8CCTCC GACCAAOACC AAGATTACGA TTATCIGAAC GAGTGGGGCA GCOGCTTCAA 2520 

GAAGCTX36CA GACATGTAOG GTGG0GGG6A GGACX3ACTAG 60GGCCTGCC TGCAGG6CTG 2580 

6GGA0CAAAC GTCAG6CCAC AGAOCATCTC CAAGGGGTCT CAGTrCOOCC TTCAGCTGAO 2640 

GACTTCQGAG CTTGTCAGGA AGTGGCOGTA GCAAC7TGGC GGAGACAGGC TAT6AGTCTG 27O0 

ACGTTAGAGT GGTTGCTTCC T7AGCCTTTC AGGATGGAGG AATGTGGGCA GTTTGACTTC 2760 

AGCACTGAAA ACCTCTCCAC CTGGG CgGG OTTGCCTCAC AGGC CAAGTT TCCAGAAGCC 2820 

TCTTACCTGC OGTAAAATGC TCAACCCTGT GTCCTGGGCC TGGGOCTGCT GTGACTGACC 2880 

TACAGTGGAC ■niVIVitritj GAATG6AACC TTCTTAGGCC TOCTGGTGCA ACTTAATTTT 2940

TTTTTTTAAT GCTATCTTCA AAAOSTTASA GAJVAOTTCTT CAWVAOTGCA GCCCAGAGCT 3000

OCTQGGCOCA CTOGCOGTCC TGCATTTCTO GTTTCCAGJIC OOCMITGOCT CCCATTCGGA 3060

TGGATCTCTO CGTTTTTATA CTGAGTGTGC CTAGOTTGCC CVrrAriTri TATTTTCCCT 3120

GTTGOGTTGC TATAGATGAA GGGTGAGGAC AATOSTGTAT ATGTACTAGA ACTTTTITAT 3180
TAAAGAAACT TTTCCCAGAA AAAAA



Seq ID NO: 539 Protein sequence
Protein Accession #: NP_001784.2

1 11 21 31 41 51

| | | | | |

HSLPRGPXiAS LUiLQVCHLQ CAASEPCRAV FREAEVTIiEA GGABQEPGQA IiGKVPMGCPG 60

QEPALPSTDN DDFTVRNGET VQERRSIiKER NPLKZPPSKB XUIRHKRDHV VAPISVFENG 120

KGPFPQRWQ LK5NXDRDTX IPYSITGPGA DSPPEGVFAV EKETGKLIjUf KPLDREEIAK 180

YELF6KAVSE KGASVBDPMK ISirVTDQND KKPKFTQDTF RGSVLBGVLP GTSVMQVTAT 240

DBmAIYTYN GWAYSIHSQ EPKDPHDUIP TZHRSTGTIS VZSSGU)REK VPSYTLTZQA 300

TDMDGD6STT TAVAWEILD ANOKAPMFDP QKYEABVPEH hVQSEVQBLT VTDLDAPNSP 360

AHRATYLIKS GODGDHFTIT THPESNQGXL TTRKGLDFEA KNQHTLYVBV TNEAPFVLKL 420

PTSTATIWH VEDVKEAPVF VPPSKWEVQ BGIPTGBPVC VYTAEDPDKE NQKISYRILR 480

DPAGNliAMDP DSGQVTAV6T LDREDEQFVR NNIYEVMVIA MDNGSPPTTG TGTLLIiTLlD 540

VNDHGPVPBP RQITZOIQSP VRQVLNITDK OLSPBTSPFQ AQLTSDSDZy WTABVNEEGED 600

TWLSLKKPL KQDTYDVHLS L5DHGNKEQL TVIRATVCDC HGHVETCPGP WXGGFZIiFVL 660

GAVLAIiLFLL tiVLLLLVKKK RKIKEPLTiTiP EDDTRDNVPy YGEBGGGEED QDYDITQLHR 720

GIiEARPEWL RNDVAPTIIP TPMYRPRPAN POBIGNFIIE NLKAANTDPT APPYDTLLVF 780
DYEGSGSDAA SLSSliTSSAS DQDQDYDYUJ EKGSRFKKLA DMYGGGEDD

Seq ID NO: 540 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..672

1 11 21 31 41 51

| | | | | |

ATGAOGCTCC AAAGACCCOG ACAGGCCXX33 GCGGGTGGGA GGOSCGCGCC CCX^GGGCGGG 60

GGGGGCTCCC CCTACCCGCC AGACCCGGGG AGAGGCGCGC GGAGGCTGCG AAGGTTCCAG 120

AAC3GGOGGGG AG088GCGCC GCGCGCTGAC OCTCCCTGGG CACCGCTGGG GACGATGGOG 180

CTGCTCGCCT TGCTGCTGGT CGTGGCCCTA CCGCGGGTGT GGACAGAOGC CAACCTGACT 240

6CX3AGACAAC GA6ATCCAGA GGACTCCCA6 OGAACGGACG AGGGTGACAA TAGAGTGTGG 300

TGTCATGTTT GIGAGAGAGA AAACACTTTC GAGTGCCAGA ACCCAAGGAG GT6CAAATGG 360

ACAGA6CCAT ACT606TTAT AGCGGC0GTX3 AAAATATTTC CACGTTTTTT CATGGTTGCG 420

AAGCAGTGCr O06CTGGTT6 TGCAGCGATG GAGAGACCCA A6CCAGAGGA GAAGCX5GTTT 480

CTCCTGGAAG AGCCCATGCC CTTCTTTTAC CTCAAGTGTT GTAAAATTCG CTACTGCAAT 540

TTAGAGG6GC CAOCTATCAA CTCATCA6TG TTCAAAGAAT ATGCTGGGAG CA1X3G6TGAG 600

AQCTGTQGTG GGCrGIGGCT GGCCATOCrC CTGCTGC30Q CCTCCATTGC AOCCXSGCCTC 660
AGCCTGTCTT GA

Seq ID NO: 541 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MRIiQRPRQAP AGGRRAPRGG RGSPYRH3PG RGARRLRRFQ KGGEGAPRAD PPHAFLGTMA 60

LLAUiXtWAL PSVHTDANLT ARQRDPEDSQ RTDEGDNRVH OIVCERENTF BOQNPRBCiCH 120

TSPYCVIAAV XZFPRFFMVA KQCSAGCAAH ERPKPEEKRF XiLEEPMPFPy LKGCRIRYCN 180
LB8PPZN8SV FKByAGSMGB SCG6Z.NLAZL ZiLLASZAAGL SLS

Seq ID NO: 542 DNA sequence

Nucleic Acid Accession #: XM_035292.2

Coding sequence: 53..1576

1 11 21 31 41 51

| | | | | |

GCTC6CTGG6 CXX30GGCTCC 0G66TGTCCC AGGCCCGGCC GGTGGGCAGA GCATG6CGGG 60

TGGGG6C00Q AAGC0GG6OG CQCTAOGOGC GC0600S6CC GAGGAGAAG6 AAGAG6C608 120

GGAGAA6ATO CTGOCCGCCA AGA60600GA CG6CT0S60G CCG6CAG60B A0660GA6GG 180

CGTGACCCTG CAGCG6AACA TCAOGCTGCT CAA0GG0GT6 QCCATCATCG TOGGGACCAT 240

TATC6GCTCG GGCATCTTCG TGACGCCCAC GGGCGTGCTC AAGGAGGCAG GCTCGCCGGG 300

6CTGG0GCTG GTG6TGTGG0 CCG0QTGC6G CGTCTTCTCC ATCGT6G606 OGCTCTGCTA 360

GGGGGAGCTC GGCACCAGCA TCTCCftAATC 60G0GG0SAC TACGCCTACA TGCtGGAGGT 420

CTAO6GCTO0 CTGCCOGCCT TCCTCAAGCT CTG6ATGGAG CT6CTCATCA TOOGGCCTTC 480

ATCGCA6TAC ATCGTGGCCC TGGTCTTC6C CACCTACCTG CTCAAGCOGC TCITCCCCAC 540

CTGCCCGGTG CCCGAGGAGG CAGCCAAGCT CGTGGCCTGC CTCTGCGTGC TGCTGCTCAC 600

GGCGGTGAAC TGCTACAGCG TGAAG6C0GC CACCOQGQTC CAGGATGCCT TTOCCGCCGC 660

CAAQCTCCTG GCCCTG6CCC TGATCATCCT GCTGGGCTTC GTCCAGAT08 QAAAGGGTGA 720

TGTGTCCAAT CTAGATCCCA ACTTCTCATT TGAAGGCACC AAACTGGAT6 TGGG6AACAT 780

T6TGCTG6CA TTATACAGCG GCCTCTTT6C CTATGGAGGA TGGAATTACT TGAATTTCGT 840

CACAGAGGAA ATGATCAACC CCTACAGAAA CCTGCCCCTG GCCATCATCA TCrCCCTGCC 900

CATCGTGA06 CTG6TGTA06 TGCTGAOCRA CCTG6CCTAC TTCACCACCC TGTCCACCGA 960

6CAGATGCIG TOGTCCQAGG CCQTGOOOGT G6ACTT06GG AACXATCAOC TG660GTCAT 1020

GTCCTG6ATC ATCCCOOTCT [illegible] GTCCTGCTTC GGCTOCGTCA AT6G6TCCCT 1080

GTTCACATCC TCCAGGCTCT TCTT06TGGG GTCC06GGAA GGCCACCT6C CCTCCATCCT 1140

CTCCATGATC CACCCACAGC TCCTCACCCC OGT6CCGTCC CraSTOTTCA OGTQTGTGAT 1200

GAOGCTGCTC TACGCCTTCT CCAAGGACAT CTTCTCOGTC ATCftACTTCT TCAGCTTCTT 1260

CftACIGGCTC TGGGTOGCCC TUUCCATCAT C8GCATGATC TGGCTGOOOC ACAGAAAGCC 1320

TGA6CTTQAG 06GCCCATCA AGGTGAACCT GGCCCTGCCT GTGTTCTTCA TCCTGGCCTO 1380

CCTCTTCCTO ATC3QC0GTCT CCTTCTGGAA GACACCCGTG GAGTGTGGOV TCG6CTTCAC 1440

CATCATCCTC AGOGGGCTGC COGTCTACTT CTTOGGGGTC T6GTGGAAAA ACAAGCCCAA 1500

GTGGCTCCTC CAGG6CATCT TCTCCACGAC 0GTCCTGT6T CAGAAQCrCA TGCAGGTGGT 1560
CCCGCAGGAG AO^TAGOCAG GAOGOCCSAGT G6CTGC0GGA GGA6CATGC

Seq ID NO: 543 Protein sequence
Protein Accession #: XP_035292.2

1 11 21 31 41 51

| | | | | |

MAGAGPKRRA LAAPAAEEKE EAREKMLAAK 8AD6SAPAGE GE6VTLQSHZ TLLN6VAZIV 60

GTIIGSGIFV TPTGVLKEAG SPGLALWMA AOOVPSIVGA LCYAEIOTTI SKSGGDYAYM 120 

LEVYGSIiFAP LKXjHIEI«LII EU^SSQYIVAL VFATYLLKPL FpTCFVPEEA AKLVACX/rVL 180

UiTAVNCrSV KAATRVQDAP AAAIOiIALAL IZLUSFVQIG KSDVSNUJPN FSPE6TKUDV 240 

GinVLALYSG LFAYGGWNYL KFVTEEHZHP YRNLFLAIXZ SLPZVTIiVyV LXmAYFTTL 300 

STBQMLSSEA VAVDFQIYHL GVMSHIIPVF VGLSCFGSVH GSliFTSSRLP FVQSREGHXiP 360 

SIIiSMIHPOL LTPVPSLVFT CVMTLLYAPS KDIFSVINPF SFFIJWI#CVAL AIIOnWLRH 420 

RKPELERFZK VNLALFVFPI LACLFLIAVS FHXTPVEOGI GFTIZLSGLP VYFFGVWWXH 480
KPKHUjQGIP STTVLGQKLM QWFQBT 

Seq ID NO: 544 DNA sequence
Nucleic Acid Accession #: NM_005268.1
Coding sequence: 168..989

1 11 21 31 41 51

| | | | | |

TAAAAAGCAA AAGAATTCGC GGCGGCGTOG ACAOGGGCTT CCCCGAAAAC CTTCCCCX5CT 60 

TCTGGATATG AAATTCAAGC TGCTTGCTGA GTCCTATTGC OGGCTGCTGG GAGCOWSGAG 120

AGCCCT6AGG AGTAGTCACT CAGTAGCAGC TGACGOGTGG 6TCCA0CATG AACTGGASTA 180 

TCTTTGAGGG ACTCCTGAGT GGGGTCAACA AGTACTCCAC AGOCTTTGGG COCATCT66C 240

TGTCTCTGGT CTTCATCTTC 0GC3GTGCTGG TGTACCTGGT GACOGCCGAG CGTGTGTG6A 300

GTGATGACCA CAAGQACTTC GACTGCAATA CTCGCCAGCC CGGCTOCTCC AACGTCTGCT 360

TTGATGAGTT CTTCCCTGTG TCCCATGTGC GCCTCTGGGC CCTGCAGCTT ATCCTGGTGA 420

CATGCCCCTC ACTGCT08T0 GTCATGCACG tGGCCTACOG G6AGGTTCAG GAGAA6AGGC 480 

ACOQAGAAGC CCATGGQGAO AACA6TGG6C GOCTCTACCT GAACCCCGGC AAGAAGOSGO 540 

GTGGGCTCTG GTGGACATAT GTCTGCA6CC TAGTGTTCAA GGCGAGCGTG GACATOSCCT 600 

TTCTCTATGT GTTCCACTCA TTCTACCCCA AATATATCCT CCCTCCTGTG GTCAAGTGCC 660 

ACGCAGATCC AT6TCCCAAT ATAGTGGACT GCTTCATCTC CAAGCCCTCA GAGAAGAACA 720

TTTTCACCCT CTTCATOGrG GCCACAGCTG CC3VTCTGCAT CCTGCTCAAC CTCGTGGAGC 780 

TCATCTACCT GGTGAGCAAG AGATGCCA06 A6TGCCTGGC AGCAAGGAAA GCTCAAGCCA 840 

TGTGCACAGa TCATCAOCCC CAC3GGTACCA GCTCTTCCT6 CAAACAAGAC GAOCTCXnTT 900 

CX3GGT6ACCT CATCTTTCTG GGCTCAGACA GTCATCCTCC TCTCTTACCA GACCOCOCXX: 960

GAGACCATGT 6AAGAAAACC ATCTTGTGAG GGGCTGCCTG GACTGGTCTG GCAOQTTGGG 1020

CCTGGATQGQ GAGGCTCTAG CATCTCTCAT AQGTQCftACC TGAGAGTGGG GGAGCTAAGC 1080 

CAT6AGGTAG 6GGCASGCAA GAGAGAGGAT TCA0A06CTC TGGGAGCCAG TTOCTAGTCC 1140 

TCAACTCCAO CCAOCTQCCC CAGCTOGACG 6CACT6GGCC AGTTCCCXXrr CTGCTCTGCA 1200 
GCTOGQTTTC CTTTTCTAGA ATGGAAATAG TGAGGGCCAA TGC 






Seq ID NO: 545 Protein sequence 
Protein Accession #: NP_005259.1 



1 11 21 31 41 51 

| | | | | |

MNWSIPBGLL SGVNKYSTAP GRIWLSLVPZ PRVLVYLVTA ERVWSM)HKD FDCNTRQPGC 60 

8NVCFDEFFP VSHVKLWALQ LZLVTCPSLL WMHVAYSEV QEXRUREAHG ENSGRLYLKP 120 

GKXSGGLHHT YVCSLVFKAS VDIAFLYVFB SFYPKYILPP WKCBADPCP NIVDCFZSKP 180 

SBKNIFTLFM VATAAZCIZ.L NLVELIYLVS KRCHECLAAR KAQAHCT6BH PHGTTSSCKQ 240 
DDLLS6DLIP LGSDSHPPLL PDRPRDHVKK TIL 



Seq ID NO: 546 DNA sequence

Nucleic Acid Accession #: NM_002391.

Coding sequence: 26..457

1 11 21 31 41 51

| | | | | |

0666GC2AAGC AG0Q00G6CA GGGAOATGCA GCAOOGftGGC TTCCTCCTCC TCAOOCTCCT 60 

OGCCCTGCTG G08CTCACCT CC3GOGGTC6C CAAAAAGAAA GATAAG6T6A AGAAGGGOGG 120 

CC0GGGGA6C GAGTGCGCTG AGTGGGOCTG GGGGCCCTGC ACCCCCA6CA GCAAGGATTG 180

CGGCGTOGGT TTCC60GAGG GCACCTGOGG GGCCCAGACC CAGOQCATCC GGTCCAGGGT 240 

GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CXSACTGCAAO TACAAOTTTG AGAACTGGGG 300 

T60GTGTQAT GGGGGCACAG OCACCAAAOT COQCCAAGOC ACGCTQAAGA A060G06CTA 360 

CAATGCTCAO TOCQUGGAGA CX3VTC0G09T CACCAAGCOC TGCACCOCCA A6ACCAAAGC 420 

AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGA06 CCAA6CCTGG ATGCCAAG6A 480

GCCCCTGGTQ TCACATGGGG CCTGGCCAOG CCCTCCCTCT CCCAGGCX:OG AQATGTGACC 540 

CACCAGTGOC TTCTGTCTGC TOGTTAGCTT TAATCAATCA TGCCCTGCCT TOTCCCTCTC 600 

ACZ'COOCAGC COQVOOOCTA A0T6CCCAAA 0T6GGGAGG8 ACAAGG8AT7 CT0GGAA6CT 660 

TGACCCTC3CC CCAAAQCAAT GTOAOTCOCA GAGCCOGCTT TTGTTCTTCC OCACAATTCC 720 

ATTACTAAOA AACMATCAA ATAAACIGAC TTTTTCCCXX: CAATAAAAGC I ' CTX ' CTrJTT 780
TAATAT 



Seq ID NO: 547 Protein sequence
Protein Accession #: NP_002382.1

1 11 21 31 41 51

| | | | | |

KQERGFLUiT LLAliLALTSA VAIOCKDKVKK OGP6SBCAEH ANGPCTPSSR DG6VGPREGT 60 
CGAQTQRZRC RVPOIHKKEP QADCRYKFEK tfGAC3)GGTGT XVRQGTLKKA RYNAQOQBTZ 120 
RVTKPCTPKT KAKAKAKKGK GKD 

Seq ID NO: 548 DNA sequence

Nucleic Acid Accession #: NM_006783.1

Coding sequence: 1..786

1 11 21 31 41 51

| | | | | |

ATGGATTGGG GGAC6CTGCA CACTTTCATC QGGGOTGTCA ACAAACACTC CACCAGCATC 60

GGGAAGGTGT G6ATCACAGT CATCTTTATT TTCOQAGTCA TGATCCTAGT GGTGGCTGCC 120

CAGGAAGTGT GGGGTGA06A GC3UU3AGGAC TTOGTCTGCA ACACACTGCA ACX3GGGATGC 180

AAAAATGTGT GCTATGACCA CTTTTTCCCG GTGTCCCACA TC0GGCTC3TG GGCCCTCCAG 240

CTGATCTTOG TCTCCACCCC AGQGCTGCTG GTGGCCATGC ATGT6GCCTA CTACAGGCAC 300

GAAAOCACTC GCAAfiTTCAG 6CGAG6A{SAg AAfiAGQAATO ATTTCAAAGA CATA6AGGAC 360

ATTAAAAAGC ACAAGGTTOO 6ATAGAG66G TOUCIXSTOST 66A0GTACAC CAGCAGCATC 420

TTTTTCCGAA TCATCTTTGA AGCA6CCTTT ATGTATGTGT TTTACTTCCT TTACAATGGG 480

TACCAOCTGC CCTGGGTGTT GAAATGTGGG ATTGACCCCT GCCCCAACCT TGTTGACTGC 540

TTTATTTCTA G6CCAACAGA 6AAGAC06T6 TTTACCATTT TTATGATTTC TGG6TCTGTG 600

ATTTGCAT6C T6CTTAA0GT GOCAQAGTTO TGCTACXZTGC TGCTGAAAGT GTGTTTTAGG 660

AOATCAAAGA GAGCACAGAC GCAAAAAAAT CAOOOCAATC AIGCCCTAAA GGAGAGTAAG 720

CACSAATGAAA TGAAT6AGCT GATTTCAOAT AGTGGTCAAA A3GCAATCAC AOGTTTOCCA 780
A6CTAA

Seq ID NO: 549 Protein sequence
Protein Accession #: NP_006774.1

1 11 21 31 41 51

| | | | | |

MDWGTLHTFI GGVKKHSTSI GKVWITVIFI FRVMILWAA QEVNGOEQED FVQm«FGC 60

KNVCYDHFFP VSHIRIiWALQ. LIFVSTPALL VAMHVAYYSK ETTSKFRSGB KRNDFKDIED 120

IKKHKVRIBG SLWWTYTSSI FPRIIFEAAF KifVPVPLYNO YHLPWVLKGO IDPCPNLVDC 180

FISRPTEKTV FTIFHISASV lOOiLNVAEL CYUiXjKVCFB R5KRAQTQKH HFHHALKESK 240
QNEMKBLISD SGQKAITGFP S

Seq ID NO: 550 DNA sequence

Nucleic Acid Accession #: NM_002571.1

Coding sequence: 99..587

1 11 21 31 41 51

| | | | | |

CATCOCrCTG GCTGCAGAGC TCAGAGCCAC CCACAGCCX3C AGCCATGCTG TGCCTCCTGC 60

TCACCCTGGG OGTGGCCCTG GTCTGTOGTO TCCOGGCCAT GGACATCCCC CAGACCAAGC 120

AGGACCTGGA GCTCCCAAAG TTGGCAGGGA CCTGGCACTC CATGGCCATG GCGACCAACA 180

ACATCTCCCT CATGGOGACA CTGAAGGCCC CTCTGAGGGT CCACATCACC TGACTGTIGC 240

CCAC0CCCX3A GGACAACCTG GAGATCGTTC TGCACAGATG GGAGAACAAC AGCTGTGTTG 300

AGAAGAAGGT OCTTQQAGAO AAGACTGG6A ATCCAAA6AA GTTCAAGATC AACTATAOGG 360

T6606AA0GA (K3CCAC6CTG CTC6ATACTG ACTACXSACAA TTTCCTGTTT CTCTGCCTAC 420

AGGACACCAC [illegible] CAGAGCATGA TGTGCCAGTA CCTGGCCAGA GTCCTGGTGG 480

AGGAOGATGA GATCATGCAG GGATTCATCA GGGCTTTCAG GCCCCTGCCC AGGCACCTAT 540

GGTACTTGCT GGACTTGAAA CAGATGGAAG AGCCGTGCCG TTTCTAGCTC ACCTCOSCCT 600

CCAGGAAGAC CAGACTCCCA GGCITCCACA OCTCCAGAGC AGTGGGACTT CCTCCTGCCC 660

TTTCAAAGAA TAAOCACAGC TCAGAAGAOO ATGAOGTGGT CATCTGTG7C GCCATCXXICT 720

TOCtGCTGCA CACCTGCACC ATTGCCATQO GQAOGCTGCT CCCTOGGGGC AGAGTCTCTG 780
GCAGAGGTTA TTAATAAACC CTTGGAGGAT G

Seq ID NO: 551 Protein sequence
Protein Accession #: NP_002562.1

1 11 21 31 41 51

| | | | | |

MDIPOTKODL BLPKLAGTHB SMAMAINNIS INATLKAPLR VHITSLLPTP EDNItEIVLHR 60

HBNNSCVEKK VL6BXTGNPK KFKZXSYTVAN BATLLDTDYD NFItPLCLQDT TTPIQSNMOQ 120
YLARVLVEDD BIMQ6FISAF RFLPRHXAm* LDLKQHEEPC RF

Seq ID NO: 552 DNA sequence

Nucleic Acid Accession #: NM_006500.1

Coding sequence: 27..1967

1 11 21 31 41 51

| | | | | |

ACTT6C8TCT GQCCCTCOGG CCAAGCATGG G6CTTCCCAG 6CTGGTCTGC GCCTTCTTGC 60

TOSOOSCCTG CTeCTGCTGT OCTCGCSGTOS OSGGTOTGCC OGGAGAGGCT GAGCAGCCTO 120

06CCTGAGCT GGTGGAGGTG GAAGTG6GCA GCACAGCCCT TCTGAAGTOC GGCCTCTCCC 180

AGTCCCAAGG CAACCrCAGC CATGTCGACT GGTTTTCTGT CCACAAGGAG AAGOGGACGC 240

TCATCTTCCG TGTGOGCCAG GGCCAGGGCC AGAG06AACC TQGG6A6TAC 6AGGAG0G6C 300

TGAOCCTCCA OQACAGAOGG OCXACTCrGG OCCTQACTCA AGTCACOCCC CAAGAOGAGC 360

6CATCTTCTT GTGCCAGGGC AAGCGCOCrC GGTCCCAQGA GTACC6CATC CAGCTCCG06 420

TCTACAAAGC TCCX3GAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCT6TGAACA 480

GTAAGGAGCC T6AQGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG 540

TCATCTGGTA CAAQAAT66C OGGCCTCTGA AGGAGGAGAA QAAC0666TC CACATTCAGT 600

GSICOCAGAC TSTGOAGTOO AGTGGTTTGT ACACCI7t3CA GASTATTCIG AAGGCACA6C 660

TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGA6CT CAACTAC0G6 CT60CCAGTG 720

GGAACCACAT GAAQGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCOQ ACAGAAAAAG 780

TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG 6AAATCA6GT 840

6TTTGGCTGA TG6CAACCCT CCAOCAGACT TCAGCATCAG CAAGCAS^ CCCAGCACCA 900

GGGAGGCAGA GGAAGAOACA ACCAAC38ACA AOSGaOTGCT GGT6CIGGA0 0CT60C06GA 960

AOGAACACAO TQG008CTAT GAATGTCAOG CCTGGAACTT G6ACACCAT6 ATATOGCTGC 1020

TGAGT6AACC ACAG6AACTA CTGGTGAACT ATGTGTCTGA CGTCCGAGT6 AGTCCOSCAG 1080

OOCCTGAGAG ACAGGAAGGC AGCAGOCTCA CCCTGACCTG TGAGGCAGAG AGTAGCCAGG 1140

ACCT06AGTT CCAGT66CT6 AGA6AAGAGA CAGAOCAGGT GCTGGAAA6G GGGCCT6T6C 1200

TTCAGTTGCA TGACCTGAAA CGGGAGGCAG GAGGOGGCTA TOGCTGOGT G GOGTCTGTGC 1260 

CCAGCATACC OGGCCTGAAC CX3CACACRGC TGGTCAAGCT GGOCATrTTT GGCCUCCCl'T X320 

GGATGGCATT CAAGGAQAGG AAGGTGT6G6 TGAAAGAGAA TAT6GTGTTG AATCTGTCTT 1380 

GTGAAGOGTC AGGGCACCOC OGGCCCAOCA TCT C CT GG AA OGTCAAOGGC A06GCAAGTG 1440 

AACAAGACCA AGATCCACAG OGAGTCCTGA GCACCCTGAA* TGTCCT0GT6 ACCCOQGAGC 1500 

TGTTGGAGAC A66TGTTGAA TGCACGGCCT CCAAC3GACCT GGGCAAAAAC ACCAOCATCC 1560 

TCTTCCTGGA GCTGGTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGGCC 1620 

TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAQ CACCTCCACA GAGAGAAAGC 1680 

TGCCGGAGCC GGAGAGCCGG GGCGTGOTCA TCGTGGCTQT GATTGTGTGC ATCCTGGTCC 1740 

TGGCGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGAA QQGCAA6CTG CC3BTGCAGGC 1800 

GC7CAG6GAA GCAGGAGATC AOGCTGCCCC CX?rCTCGTAA GACOGAACTT GTAGTTGAAG 1860 

TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GGGCAGCA6C GGTQACAAGA 1920 

GG6CTC0GGC AGACCAGGGA GAGAAATACA TCGATCTGAG GCATTAGCXX: OGAATCACTT 1980 

CAGCTCOCTT CCCTGCCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 2040 

CXrrCCAAAGG GACTAGAGAO AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 2100 

GGGCACTGGG TTAGGACCTO AG6A0CTCAC TT66CCCTGC AAGCCGCTTT TCAGGGACCA 2160 

GTCCACCAOC ATCTCCTCCA CGTTGAGTGA AGCTCATCOC AAGCAAGGAQ CCCCAGTCTC 2220 

CCGAGCGGGT AG6AGAGTTT CTTGCAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 2280 

AAATACCTGG CTCCTGCCAG CRGCTGAGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC 2340 

CAAA66CIGG CTTCXZACCAT CCAGGTGCAC CACTGAAGTG AG6ACACACC 6GAGCCAGGC 2400 

GCCTGCrCAT GlTGAAiGTGC GCTGTTCACA CCOGCTCOGG A6AGCACC0C AG06GCATCC 2460 

AGAAGCaGCT 6CAGTGTT0C TGCCACCACC CTCCTGCTOG CCTCTTCAAA GTCl-CCTgrO 2520 

ACATTTTTTC TTTGGTCAGA A6CX»GGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG 2580 

GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGA6GC06A GGOGGGCGGA 2640 

TCACAAAGTC AQGACGAGAC CATCCTQGCT AACAOSGTQA AACCCTGTCT CTACTAAAAA 2700 

TACAAAAAAA AATTAGCTAO OCGTAOTGGT TG6CACCTAT AGTCCCAGCT ACTOGGAAOO 2760 

CTGAA6CAGG A6AATGGTAT GAATCCAGGA GGTGGAGCTT 6CAGTGAGCC GAGACG6T6C 2820 

CACTGCACTC CAGCCTGGGC AACACAGCGA GACTCOGTCT OGAGGAAAAA AAAAGAAAAG 2880 

A0G0C5TACCT GOGGTGAGGA A6CTGGG0GC TGTTTTaSAG TTCAGGTGAA TTAGCCTCAA 2940 

TCCCOCTGTT OICTTGCTCC CATAGCCCTC TTGATGGATC A06TAAAACT GAAAGGCAGC 3000 

G6GQA6CA0A CAAAGATGA6 GTCTACACTO TCCTTCATQO GQATTAAA6C TATG6TTATA 3060 

TTA6CACCAA ACTTCTACAA ACCAAGCTCA GG6CO0CAAC 0CTAGAA006 CXXrAAATOAO 3120 

AGAATGGTAC TTAGGOATGO AAAAOGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTQT 3180 

CTGTGTOTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTGTGTA AATTTGCAAA 3240 

TT6TTTCCTT TAZATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TGTCCCAGAA AATCATACAT TGCTTrTTTA TTCXACATG6 GTAOCACA0O 3360 

AACCTGGGGG CCTGT6AAAC TACAACCAAA AGGCACACAA AACGGTTTCC AGTTGGCA6C 3420 

AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAO 3480 

CTACCCTACT TTTCAGCAGC AAAACGTCCXT GTATGACGCA GCACGAAOGO CCTGGCAOOC 3540 
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTOOGTCCA CTT 

Seq ID NO: 553 Protein sequence 
Protein Accession #: NP_006491.1

1 11 21 31 41 51

| | | | | |

GLPRLVCAFL LAACCCCPRV AGVPGEAEQP APELVEVEV6 STALLKOGLS QSQGNLSRVD 60 

WFSVHKEKRT LIFRVRQGQG QSEPGEYEQR LSbODRGATl* ALTQVTPQDB RIFLOQGKRP 120 

RSQEYRIQLR VYKAPEEPNI QVNPI.GIPVN SKEPEEVATC VGRNGYPIPQ VIWYKNGRPL 180 

KEEKNRVBIQ SSQTVESSGL YTLQSILKAQ LVKEDKDAQP YCELKYRLPS GNHMKESRKV 240 

TVPVFypTBK VNLSVBPVGM LKBGX2RVEXR CLADGHPPPB FSISKQNPST REABBBTTHD 300 

N6VLVLEPAR KEHSGRYBOQ AMNLDTMISL LSEPOEliLVN YVSDVRVSPA APBRQBGSSL 360 

TLTCEAESSQ DLEPQWLREE TDC3VLSRQPV LQLHDLKREA GGGYRCVASV PSIPGLNRTQ 420 

LVKIiAIFGPP WMAFKERKVH VKEKMVLNLS CEASCTPRPT ISHNVNGTAS EQDQDPQRVL 480 

STUJVLVTPB LLETGVECTA SNDU5KHTSI LFLELVNLTT I.TPDSWTTTG LSTSTASPHT 540 

RANS7STERK LPSPESR6W IVAVIVCZLV LAVLGAVLYF LYKKGKLPCR RSOXQEITLP 600 
PSRKTELWE VKSDKLPEEM GIiLQGSSGDK RAPGDOGERY ZDLRH 

Seq ID NO: 554 DNA sequence

Nucleic Acid Accession #: NM_003183.3

Coding sequence: 165..2639

1 11 21 31 41 51

| | | | | |

TCXSAGCCTGG CGGTAGAATC TTCCCAGTAO GOGGCGCGGG AGGGAAAAGA GGATTGAGGG 60 

GCTAGGC08G G0QGATCCCX3 TCCTCOCCOG ATGTGAGCAG TTTTCCGAAA OCCOGTCAGQ 120 

OQAAGGCTGC <XAGAGAGGT GGAGTCGGTA GCGGGGCCGG GAACATGAGO CAGTCTCTCC 180 

TATTCCTGAC CAGCGTGGTT CCTTTOGTGC TGGCGCOGOG ACCTCXX3GAT 6ACCCGGGCT 240 

TCGGCCCCCA CCAGAGACTC GAGAAGCTT6 ATTCTTTGCT CTCAGACTAC GATATTCTCT 300 

CTTTATCTAA TATCCAGCAG CATTCGGTAA GAAAAAGAGA TCTACAGACT TCAACACATG 360 

TAGAAACACT ACTAACTTTT TCAGCTTTGA AAA6GCATTT TAAATTATAC CTQACATCAA 420 

GTACTGAAOG' TTTTTCACAA AATTTCAAGO TCGTGGTGGT GGATGGTAAA AAOGAAAGOQ 480 

AGTACACTGC AAAATG6CAG GACTTCTTCA CTG6ACA0GT GGTTGGT6AG CCTGACTCTA 540 

GGGTTCTAGC CCACATAAGA GATGATGATG TTATAATCAG AATCAACACA GATG GGGC CG 600 

AATATAACAT AGAGCCACTT TCGAGATTTG TTAATGATAC CAAAGACAAA AGAATGTTAG 660 

TTTATAAATC TGAAGATATC AAGAATGTTT CACGTTTGC3V GTCTCCAAAA GTGTGTGGTT 720 

ATTTAAAA6T GGATAATGAA GAGTTGCTCC CAAAAGGGTT AGTAGACAGA GAACCACCT6 780 

AAGAGCTTGT TCATOGAGTG AAAAGAAGAG CTGACCCAGA TCCCATGAAO AACACGTGTA 840 

AATTATTGGT GGTAGCAGAT CATOOCTTCT ACAGATACAT GGGCAGAGGG GAAGAGAGTA 900 

CAACTACAAA TTACTTAATA GAGCTAATTQ ACAGAGTTQA TGACATCTAT CGGAACACTT 960 

CATGGGATAA TGCAGGTTTT AAAGGCTATO GAATACAGAT AGAGCAGATT CX3CATTCTCA 1020 

AGTCTCCACA AQAGGTAAAA CCTGGTGAAA AGCACTACAA CATGGCAAAA AGTT ACCCAA 1080 

ATGAAGAAAA 06AI0CTTGG GAT6TGAAGA TGTTQCTAGA GCAATTTAGC TTTGATATAG 1140 

CTGAG6AA6C ATCTAAAGTT TGCTTGGCAC AOCTTTTCAC ATACCAAGAT TTT GATAT GG 1200 

6AACTCTTGG ATTA6CTTAT GTTGGCTCTC CX3USRCCAAA CAGCCATGGA GGTGTTTGTC 1260 

CAAAGGCTTA TTATAGCCCA GTTGGGAAGA AAAATATCTA TTTGAATAGT GGTTTGAOGA 1320 

GCACAAAGAA TTATGGTAAA ACCATCCTTA CAAAGGAAGC T6ACCT0GTT ACAACTCATO 1380

AATTGGGAOV TAATTTTGGA GCAGAACHTG ATOOGGATGG TCTAGCAGAA TGTGCCCOGA 1440

ATGAiGGACai GGGAGGGAAA TATGTCATGT ATCCCATAGC TGTGAGTG6C GATCAOGAGA 1500 

ACAATAAGAT GTTTTCAAAC TGCAGTAAAC AATCAATCTA TAAGACCATT GAAAGTAAGG 1560 

CCCAGGAGTG TTTTCAAGAA OGCAGCAATA AAGTTTGTGG GAACTCGAGG GTGGATGAAG 1620 

GAGAAGAGTQ TGATCCTGGC A7CATGTATC TGAACAAOGA CACCT6CTGC AACA606ACT 1680 

GCACGTTGAA QGAAGGTGTC CAGTGCAGIG ACAGGAACAG TGCTTGCTGT AAftAA CTQTC 1740 

AGTTTGAGAC T6CCCAGAAG AAGT6CCA6G AGGGGATTAA T3CTACTTGC AAAOGCGTOT 1800 

CCTACTGC3W: AGGTAATAGC AGTGAGTGCC OGCCTCCAGG AAATGCTGAA AATGACACTG 1860 

TTTGCTTGGA TCITGGCAAG TGTAAGGATG GGAAATGCAT CCCTTTCTGC GAGAGGGAAC 1920 

ACCAGCTGGA GTCCTGT6CA TGTAATGAAA CTGACAACTC CTGCAA GGTG TGCTGCAGGG 1980 

ACCTTTCTGO CCXgCl^i'GT G OCCTATGTOS ATGCTGAACA AAAQAACTTA TTTTTQA6GA 2040 

AAGGAAA6CC CTGTACA6TA GGATTTTGTG ACA7GAATG6 CAAAT0T6A6 A AAOGAOT AC 2100 

AGGATGTAAT TGAAOGATTT TXjGGATTTCA TTGACCAGCT GAGCATCAAT ACTTTTGGAA 2160 

AGTTTTTAGC AGACAACATC GTTGGGTCTG TCCTGGTTTT CTCCTTGATA TTTTGGATTC 2220 

CTTTCAGCAT TCTTGTCCAT TGTGTGGATA AGAAATTGGA TAAACABTAT GAATCTCTGT 2280 

CTCI G TTTC A CCCCAGTAAC GTCGAAATGC TGA6CA0CAT G6ATTCTGCA TGGGTTOGCA 2340 

TTATCAAACC CTTTCCTGOO CCCCAGACTC CAGGC0GC3CT GCAGCCTGCC CCTGTGATCC 2400 

CTTCGGCGCC AGCAGCTCCA AAACTGGACC ACCAGAGAAT GGACACCATC CAGGAAGACC 2460 

CCAGCACAGA CTCCCATATG GACGA6GAT0 GGTTTGAGAA GGACCCCTTC CCAAATAGCA 2520 

GCACAGCTGC CAAGTCATTT GAGGATCTCA 0G6ACCATCC GGT06CCAGA AGTGAAAAGG 2580 

CTGCCTCCTT TAAACT6CAG GQTCAGAATC GTGTTAACAG CAAAGMACA GASTGCTAAT 2640 

TTAGTTCTC3V GCTCTTCTGA CTTAAGTGTG CAAAATATTT TTATAGATTT GACCTACAAA 2700 

TCAATCACAG CTTGTATTTT GTGAAGACTG GGAAGTGACT TAGCAGATGC TGGTCATGTG 2760 

TTTGAACTTC CTGCAGGTAA ACAOTTCTTQ TGTGGTTTGO CCCTTCTCCT TTTGAAAAGG 2820 

TAAGGTGAAA GTGAATCTAC TTATTTT6A6 GCTTTCAGGT TTTAGTTTTT AAAATATCTT 2880 

TTGACCTCTG GTGCAAAAGC AGAAAATACA GCTGGATTGG GTTATGAATA TTTACGTTTT 2940 

TGTAAATTAA TCTTTTATAT T6ATAACAGC ACTGACTAGG GAAATGATCA GTTTTTTTTT 3000 

ATACACTGTA ATCAACCGCT GAATATGAAG CATTTGGCAT TTATTTGTGA GAAAAGTGGA' 3060 

ATAGTTTTTT • rmT TTTTT TTTmTTGC CTTCAACTAA AAACAAABGA GATA AATTTA 3120 

GTATACATTG TATCTAAATT GTGGGTCTAT TTCTAGTTAT TACCXAGAST TTTTATGTAG 3180 

CAGGGAAAAT ATATATCTAA ATTTAGAAAT C3VTTTGGGTT AATATGGCTC TTCATAATTC 3240 

TAAGACTAAT GCTCAGAACC TAACCACTAC CTTACAGTGA GGGCTATACA T0GTA6CCA6 3300 

TTGAATTEAT G6AATCTA0C AACTGTTTAG GGCXXTTGATT TGCTGGGCAG TTTTTCTGTA 3360 

TTTTATAAGT ATCTTCATGT ATCCCXtrTTA CTGATAGGGA TACATGTCTT AGAAAATTCA 3420 

CTATTOGCTG GGAQTaOTGG CTCATGCCT3 TAATOCCAGC ACTTGGAGA6 GCTGAGGTTG 3480 
06CCACTACA CTCCASCCTG GGrQACAGAO TOAGATCT8C CTC 

Seq ID NO: 555 Protein sequence
Protein Accession #: NP_003174.2

1 11 21 31 41 51

| | | | | |

MRQSLLFLTS WPFVIAPRP PI»PGFGFRQ RLBKUISLLS DYDILSbSKZ QQHSVRKSDL 60 

QTSTBVETLL TFSALKRKFX LYIiTSSTSRP SQI7FKVWVD GXZ^BYTAR KQDFFTGHW 120 

GEPDSRVLAH IKDDDVIIRI NTDGAEYNIE PLWRPVNDTK DKRMLVYKSB DIKNVSRI/QS 180 

PKVOGYLKVD NEELLPKGLV DREPPEELVH RVKRRADPDP MKNTCKLLW ADHRPYRYMQ 240 

RGEESTTTHY LIELIDRVDD lYRNTSWDMA GFKGYGIQIB QIRILKSPQB VKPGEKHYDM 300 

AKSYPNEEKD AHDVXMUiEQ FSFDZAEEAS KVCLAHLFTV Q DPPMgT LGL AYVGSPRANS 360 

HGCVCPXAYY SFVGKIVZYIi NSGLTSTKNY GKTILTKEAD LVTTBBUaN FGAEHDPDGL 420 

AECAPKHX2G GKYVMYPIAV SGDHENNKMF SNCSKQSIYK TIESKAQBCP QERSNICVC3QT 480 

SRVDBGBECD PGIMYLNNDT CCNSDCTLKE GVQCSDRNSP CCZNCQFETA QKKCQEAINA 540 

TCKGVSYCTG MSSECPPPGN ABIDTVCLDL GKOOXSKCIP FCEREQQLBS CAOIETDKSC 600 

KVCCRDL56S CVPYVIIABQX laLFLRXOKPC TVGFaiMNGK CBKRVQDVIB RFHDPZOOLS 660 

INTFQKFIAD NZV6SVLVFS LZFWZPPSXXi VBCVDKKUnC QYESLSLFEP SNVEMLSSMD 720 

SASVRXIXPF PAPQTP6RLQ PAPVZPSAPA APKLDHQKMD TIQEDPSTDS HMDEDGFBXD 760 
PFPKSSTAAK SFEDLTDBFV ARSBKAASPK LQRQSRVNSR BTEC 

Seq ID NO: 556 DNA sequence

Nucleic Acid Accession #: NM_021632.1

Coding sequence: 164..2248

1 11 21 31 41 51

| | | | | |

TCBAGCCTGG CGGTAGAATC TTCCCASTAG G0GQ06GGGG AGGAAAAGAG GATTGAGGG6 60 

CTAGGCOGGQ CGGATOXGT CCTCCCCCGA TGT6AGCAGT TTTCC6AAAC CC0GTC31SGC 120 

GAAGGCT6CC CA6AGAGGTG GAGTOGGTAG 0GGG6CCGGG AACATGAGGC AGTCTCTCCT 180 

ATTOCTGACC AGOGTGGTTC CTTTOGTGCT GGOGCOGOSA CCTCCGGATG ACCOGGGCTT 240 

GG60C3CCCAC C3U3AGACTCX3 ASAAGCTTGA TTCmOCTC TCAGACIA06 ATATTCTCTC 300 

TTTATCTAAT ATCCAGCAGC ATTOGGTAAG AAAAAGAGAT CTACAGACrT CAACACATGT 360 

AGAAACACtA CTAACTTTTT CAGCTTTQAA AAGGCATTTT AAATTATACC TGACATCAAG 420 

TACTGAAOGT TTTTCaCAAA ATTTCAAGGT CGTGGTGGTG GATGGTAAAA AOGAAAGCGA 480 

GTACACIGTA AAAT6GCAG6 ACTTCTTCAC TGGACACGT6 GTTG6TGA6C CTGACTCTAO 540 

G6TTCXAGGC CACATAAGA6 ATGAT6ATGT TAXAATCAQA ATCAACACAG ATGGGGOOGA 600 

ATATAACATA 6A6GCACTTT GQAGATTTGT TAAT6ATACC AAAGACAAAA GAATCTTAGT 660 

TTATAAATCT 6AAQATATCA AGAAT6TTTC ACGTTTGCAG TCTCCAAAA6 TGTGTGGTTA 720 

TTTAAAAGTG GATAATGAA6 AGTTGCTCCC AAAAGGGTTA GTAGACAGAG AACCACCTGA 780 

AQAGCTTGTT CATOGAGTGA AAA6AAGA6C TGACCCAGAT GCX»T6AASA ACA06TGTAA 840 

ATT A TTG G TG GTA0CA6ATC ATOGCTTCTA CASATACAIG G6CAGAGGGG AAQAGAGTAC 900 

AACTAGAAAT TACTTAATAG AGCTAATT6A CAQASTTGAT GACATCTATC GQAACACTTC 960 

ATGGGATAAT GCAGGTT7TA AAGGCTATGG AATACAGATA GAGCAGATTC GCATTCTCAA 1020 

GTCTCCACAA GAGGTAAAAC CTGGTGAAAA GCACTACAAC ATGGCAAAAA GTTACCCAAA 1080 

TGAAGAAAA6 GATGCTTGGG ATGTGAAGAT GTTGCTAGAG CAATTTAGCT TT6ATATA0C 1140 

TGAGGAAGCA TCTAAAGTTT GCTTGGCACA OCTTTTCACA TACCAAOATT TTGATAIGGG 1200 

AACTCTTGGA TTAGCTTATG TTG6CTCTCC CA6AGCAAAC AGCCAT6GAG GTGTTTGTCC 1260 

AAAGGCTTAT TATAGCCCAG TTGGGAAQAA AAATATCTAT TTGAATAGTG GTTTGAaSAG 1320 

OVCAAAGAAT TATGGTAAAA CCATCCTTAC AAAGGAAGCT GACCTGGTTA CAACTCATGA 1380 

ATTOGGACAT AATTTTG6AG CA6AACA30A TCCX3GATGGT CTA6CAGAAT GT60CG0SAA 1440

TGAGGACCftG GGftGGGAAAT ATGTCATGIA TCOCATABCT GTGAGIG60G ATCACGAGAA 1500

CAATAAGAT6 TTTTCAAIlCr GCAGTAAAOV ATCAATCTAT AAGAOC A TTG AAAGTAAGGC 1560 

CCAGGAGTGT TTTCAAGAAC 6CAGCAATAA AGTTTGT6GG AACT0GAGG6 TGGATGAAGG 1620 

AGAAGAGTGT GATCCTGGCA TCATGTATCT GAACSUVOSAC ACCTGCTGCA ACAGCGACTG 1680 

CAOQTTGAAO GAAGGTGTCC AGTGCA6TGA CAGGAACAGT CXrTTGCTGTA AAAACT6TCA 1740 

GTTTGAGACT OCCCAGAAGA AGTGCCAGGA GGGCSATTAAT GCTACTT6CA AAG60STGTC 1800 

CTACT6CACA OGTAATAGCA GTGAtSTGCOC GCCTGCAGGA AAT6CTGAAG AT6ACACTGT 1B60 

TTGCrrGGAT CTTGGCAAGT GTAAGGATGQ GAAATGCATC CCTTTCTGCG AGAGGGAACA 1920 

GCAGCTGGAG TCCTQTGCAT GTAATGAAAC TGACAACTCC TGCAAGGTGT GCTGC3WX3GA 1980 

CCTTTCOGGC CGCT6TGTGC CCTATGTOSA TGCTGAACAA AAGAACTTAT TTTTGAGGAA 2040 

AGGMAGOCC TGTACAGTA6 6ATTTTGT6A CAT6AATG6C AAATSTGASA AAOGAGTACA 2100 

6GATGTAATT GAACXSATTTT GGGATTTCAT TGACCAGCTG AGCATCAATA Cri TT GG AAA 2160 

GTTTTTAGCA GACAACATOG TTGGOTCTGT CXrrGGTTTTC TCCTTGATAT TTTGQATTCC 2220 

TTTCAGCATT CTTGTCX31TT GTGTGTAAOG TOSAAATGCT GAGCAGCATG GATTCTGCAT 2280 

OGGrrOGCAT TATCAAACCC TTTCCTGOGC CCCAGACTCC AGGCCGCCTG CAGCCTGCCC 2340 

CTGTGATCCC TTCGG06CCA GCAGCTOCAA AACTG6ACCA CCAQAGAAT6 QACACCATCC 2400 

AGGAAGACGC CAGCACAGAC TCACATATGG AOGAGGATGG GTTTGAGAAG GACCCCTTCC 2460 

CAAATAGCAG CACAGCTGOC AAGTCATTTX3 AGGATCTCAC GGACCATCCG CTCACCA6AA 2S20 

GTGAAAAGGC TGCCTCCTTT AAACTGCAGC GTCAGAATCQ TGTTGACAGC AAAGAAACAG 25 BO 

AGTGCTAATT TAGTTCTCAQ CTCTTCT6AC TTAAGTGTGC AAAATATTTT TATAGATTTG 2640 

ACCTACAATC AATCACAOCT TATATTTTGT GAAGACTGGG AAGTGACTTA GCAGATGCTO 2700 

GTCATCTGTT TGAACTTCCT GCAGGTAAAC AGrTCTTGTG TGGTTTGGCC LTr CXtXrm ' 2760 

T6AAAAGGTA AGGTGAAGGT GAATCTAGCT TATTTTGAGO CTTTCAGGTT TTAGTTTTTA 2820 

AAATATCXTT TGACCTGTG6 TGCAAAAGCA GAAAATACAQ CTGGATTGGO TTATGAGTAT 2880 

TTAOGTTTTT GTAAATTAAT CTTTTATATT GATAACAQGC ACTGACTAGG GAAATGATCA 2940 

GTTTTTTTTT ATACACTGTA ATGAACCQCT GAATATGAAG CATTTGGCAT TTATTTGTGA 3000 

GAAAAGTGGA ATAGTTTTTT ' i ' m - nTril TTTTTTTTGC CTTCAACTAA AAACAAAGGA, 3060 

OATAAATTTA GTATACATT6 TATCTAAATT GTGGGTCTAT TTCTAGTTAT TACCCAGAGT 3120 

TTTTATGTAO CAGGGAAAAT ATATATCTAA ATTTAGAAAT CATTTGGQTT AATATGGCTC 3180 

TTCATAATTC TAAGACTAAT GCTCAGAACC TAACCACTAC CTTACAGTGA GGGCTATACA 3240 

TGGTAGCCAG TTGAATTTAT GGAATCTACC AACTGTTTAG GGCCCXX3ATT TGCTGGGCAG 3300 

TTTTTCTGTA TTTTATAAOT ATCTTCATGT ATCOCTGTTA CTGATAGGGA TACATGTCTT 3360 

AGAAAATTCA CTATT GG CTG GGAGTG6TGG CTCATG0CT6 TAATCOCAGC ACTTGGAGAO 3420 
3421 GCTQAG6TT6 OGOCACTACA CTCCAGOCTS GGT6ACA6AG TQAGATCT6C CTC 

Seq ID NO: 557 Protein sequence 
Protein Accession #: NP_066604.1 

MRQSLLPLTS WPPVLAPRP PDDPGFGPEQ RLBKLDSLLS DYDILSLSKI QQHSVRKHBL 60 

QTSTHVETZjIi 7PSALKRHFK LYLTSSTERF SQNFKVWVD GKNESEYTVK WQDFFTGKW 120 

6BPDSRVLAB ZRODDVZZRZ NTDGAEyNZB PLHRFVNDTK DKRNLVYKSB DZXKVSRLQS 180 

PKV06YLKVD NBEUiPKGLV DREPPEELVH RVKBRADPDP MKNTCKLLW ADHRFYRYMG 240 

RGEESTTTNY liIELIDRVDD lYRNTSWDNA GFKGYGIQIB QIRILKSPQS VKPGEKHYNM 300 

AKSYPNEEKD AHDVK^fLII£Q PSFDXASEAS KVCLAHLFTY QDFDMGTIX3L AYVGSPRANS 360 

HGGVCFKAYY SFVGKKNIYL NSGLTSTRNY GKTXLTKEAD LVTTHELGBK FGAEBDPDGIt 420 

AECAFNEDQG GiorvMypzAV SGDHENtlKMP 8NCSKQSXYK TZESXAQECP QBRSNKVOCBI 480 

SRVDE6EBCD POXMyLHNDT OCNSDCTLKB 6VQCSDRNSP 0CKHC3QFETA QKKOQEAZNA 540 

TCKGVSYCTG KSSBCPPPGN AEDDTVCLDL GKCKDGKCZP PCBRBQQIiBS CAQ9ETIMSC 600 

KVCC31DLSGR CVPyVDAEQK HLFUIKGKPC TV6FCDMMQK CBKRVQDVZB RFWDFZDQIiS 660 

INTFGKFLAD NIVGSVLVPS LZFWXPFSII« VHCV 

Seq ID NO: 558 DNA sequence 

Nucleic Acid Accession #: NM_004994.1 

Coding sequence: 20..2143 

1 11 21 31 41 51 

I I I I I I 

AGACACCTCT GCCCTCACCA TGAGCCTCTG OCAGCCCCTG GTCCTGGTGC TCCTG6T6CT 60 

GGGCTGCTGC TTTGCTGCCC CCAGACAGCG CCAGTCCACC CTTGTOCTCT TCCCTGGAGA 120 

CCTGA6AAGC AATCTCACOG ACAGGCAGCT GQCAGAGGAA TAOCTGTACC GCTAT6GTTA 180 

CACTCB08TG GCAGAGATGC GTQGAGAGTC OAAATCTCTG 66G0CTGCGC TGCTGCTTCT 240 

CCAGAA6CAA CTGTCCCT6C C0QAGACC60 TGAGCTGGAT A6CGCCACGC T6AAG6CCAT 300 

GCQAACCCCA OGGTGOGGGG TCCCAGACCT GGGCAGATTC CAAACCTTTG AGGGOGACCT 360 

CAAGTGGCAC CACCACAACA TCACCTATTG GATCCAAAAC TACTC3GGAAG ACTTGCXX^ 420 

GGCGGTGATT GAOGAOGCCT TTGCCCGCGC CTTOGCACTG TGGAQCOOGQ T GAOGO OGCT 480 

CACCTTCACT C60GTGTACA QCOQGQAC30C AOACATOOTC ATCCAGTTTQ GTGTOSGGGA 540 

GCA066AGAC GG6TATCCCT T06ACGGGAA G6ACG6GCTC CTGGCACAC6 OCTTTCCTCC 600 

TG6CXXX3QGC ATTCAQGGAG AOGCCCATTT OGAOGATGAC GAGTTGTGGT CCCTGGGCAA 660 

GGG0GTCX3TQ GTTCCAACTC GGTTTGGAAA OGCAGAT6GC GCGGCCT6CC ACITOCCCTT 720 

CATCTT06AG GGC06CTCCT ACTCTGCCTQ CACCACC3GAC GGT06CTCCX3 AOGOCTTGCC 780 

CT GO T G CAGT ACCAOQQOCA ACTAGGACAC OGAOSACOOQ TTTOGCTTCT GCOOCAGGaA 840 

GAGACTCTAC ACC06QGA06 GCAAT6CT6A TGGGAAACOC T6CCAGTTTC CATTCATCrr 900 

CCAAGGCCAA TCCTACTCOG CCTOCACCAC GGACGGT06C TC06ACGGCT ACC5GCTGGTG 960 

OGCCACCACC GCCAACTACG ACOGGGACAA GCTCTTCGGC TTCTGCCCGA CCOGAGCTQA 1020 

CTCGA0G6TG ATGGGGG6CA ACTCGG06GG GGAGCTGTGC GTCTTCCCCT TCACTTTCCT 1080 

6GGTAAGGAG TACTCGACCT GTACCAQGQA GGGCOSOGGA GATG6GG6CC TCT6GTGCGC 1140 

TA0CA0CT06 AACTTTGACA GOGACAA6AA GTGGGGCTTC TGCCCGQUrC AAGGATACAG 1200 

TTTGTTCCTC GTGGOGGCGC ATGAGTTC6G CCAOGCGCTG GGCTTAGATC ATTCCTCAGT 1260 

GCOGGAGGCG CTCATGTACC CTATGTACCG CTTCACTGAG GG6CCCCCCT TQCATAAGGA 1320 

OGAOGTGAAT G6CATCCGGC A0CTCTAT6G TOCTCXSCCCT GAACCTGAGC CA06GCCTCC 1380 

AAOCACCAOC ACAOOGCAGC CCAOSGCrCC C00QA0G6TC TQCCCCAC08 GACOCCCCAC 1440 

TGTCCACCCC TCAGA606CC CCACAGCTGG CCCCACAG6T CCCCCCTCAG CT06CGCCAC 1500 

AGGTCCCOOC ACTGCTGOCC' CTTCTAOGOC CACTACTGTO CXrTTTGAGTC CG6TGQA0GA 1560 

TGOCTGCAAC QTQAACATCT TOGACGOCAT OGOGGAGATT GGGAACCAGC TGTATTTGTT 1620 

CAAQGATGG6 AAGTACTGGC GATTCTCTGA GG6CAGGGGG AGCOGGOOQC AGGGCCCCTT 1680 
OCTTAI060C GACftAgTGGC 0CGCGCT60C OOSCMSCIG GAC TOGG TCT TTSAOGAGOC 1740 

GCTCTCCAA6 AAGCTTTTCT TCTTCTCTGG 606CCAGGT6 TGGGIGTACA CAGG06CGTC 1800 

GGTGCTGGGC CCGAG6CGTC TGGACAAGCT 6GG0CTG8GA GCXXSUOTGO COCAGGTGAC 1860 

OGGGGCCCTC OSGAGTGGCA GGGGGAAGAT GCTGCT GT TC AGCGG6CGGC GCCTCTGGAG 1920 

GTTOSA06T6 A AGGOBC AGA T06TXSGATGC COSGAGCGCC AGCGAGGTG6 AC06GATGTT 1980 

GCCOBGGGTG OCITTGGACA OGCAGGAOCST CT7CCAGTAC GGAGAGMA6 OCT A TlTCitS 2040 

GCAGGA0C6C TTCTACTGGC G06TGAGTTC COGGAGTGAQ TTGAACCAGG TGGACCAA6T 2100 

GGGCTAOGTG ACCTATGACA TCCTGCAGTG CCCTGAOGAC TAGGGCTCCC GTCCTGCTTT 2160 

GCAGTGCCAT GTAAATCCCC ACTGGGACCA ACCCTGGGGA AGGAGCCAGT TTGCOGGATA 2220 

CAAACTCGTA TTCTGTTCTO GAGGAAAGG6 AGGAGTGGAG GTGGGCTGGG CCCTCTCTTC 2280 
TCAOCTTTGT TTTTTOTTGO AOTGTTTCTA ATAAACTTGG ATTCTCTAAC CTTT 

Seq ID NO: 559 Protein sequence 
Protein Accession #: NP_004985.1 

1 11 21 31 41 51 

I I I I I I 

MSLWQPLVLV LLVIX3CCFAA PRQRQSTLVL FPGDLRTMLT DRQLAEEYLY RYGYTRVAEM 60 

RGESKSLGPA LLLLQKQIiSL PETGELDSAT LKAMRTPRCG VPDLGKFQTF EGDLKKHHHN 120 

ITYHIQNYSE DLPRAVIODA FARAFALHSA VTPLTFTRVy SRDADIVZQF GVAEHGD6YP 180 

FDG KPGIi LAH APPPGP6ZQG XUHFDODELH SL6KGVWPT RFCSIADGAAC BFPFXPBGRS 240 

YSACTTDGRS DGLFWCSTTA NYDTDDRFGP CPSERLYTRD QUXXSKPGQP PFIFQGQSYS 300 

ACTTDGRSDG YRWCATTANY DRDKLFGPCP TRADSTVMGG NSAGBLCVPP PTPLGKBYST 360 

CT8BSRGDGR LKCAT7SNFD SDKKWGPCPD QGYSLPLVAA HEFGHALGLD HSSVPEALKY 420 

PMYRFTE6PP LHKDDVNGIR HLYGPRPEPE PRPPTTTTPQ PTAPPTVCPT GPPTVHPSER 480 

PTA6P7GPPS AGPTGPPTAG PSTATTVFLS PVDDACNVKI FDAZAEZGKQ LYLFKDGKYN 540 

RPSEGRGSRP QGPFLZADXW PALPRXLDSV FEEPZiSKKLP FFStStQVHVY T6ASVLGPRR 600 

U}KL6U»DV AQVTGALRSG R6KHLLFSGR RU9RFDVKAQ MVDraSASSV DRMFFGVPLD 660 
TRDVFQYREK AYFOQDRFYH SVSSRSBUIQ VDQV8YVTYD ILQCPED 

Seq ID NO: 560 DNA sequence 

Nucleic Acid Accession #: NM_000213.1 

Coding sequence: 127..5385 

1 11 21 31 41 51 

I I I I I I 

CGCCaSCGOG CTGCAGCCOC ATCTCCTAGC GGCA6CCXAG GOGCGGAGGO AGOGAGTCCG 60 

CCCCGAGGTA GGTCCAGGAC G6GGGCACAG CAGCAGCCXSA GGCTGGCOGG GAGAQG6AG6 120 

AAGAGGATGG CAGGGCCACO CCCCAGCCCA TGGGCCA6GC T6CTCCTG6C A6CCTTGATC 180 

AGCGTCAGCC TCTCTGGGAC CTTGGCAAAC CGCTGCAAGA AOGCCCCAGT GAAGAGCTGC 240 

AOGGAGTGTG TCC3GTGTGGA TAAGGACTGC GCCTACTGCA CAGACGAGAT GTTCAOGGAC 300 

CX^GOGCTGCA ACACCCAGGC GGAGCTGCTG GC0G06GGCT GCCA6CGGGA GAGCATCGT6 360 

GTCATGGAGA 6CA6CTTCCA AATCACAGAG GAGACCCAGA TTGACACCAC CCTG0G60GC 420 

AGCCAGATGT C CCCO CAAGQ OCTGOSGGTC 05TCTGCXJGC OOGGTGACGA GCGGCATTTT 480 

GAGCTGGAGG TGTTTGAGCC ACTGGAGAGC CCGGTGGACC TGTACATCCT CATGGACTTC 540 

TCCAACTCCA TOTCCGATGA TCTOGACAAC CTCAAGAAGA TOGGGCAGAA CCT60CTC3GG 600 

GTCCTGAGCC AGCTCACCAG OGACTACACT ATTGGATTTG GCAAGTTTGT GGACAAAGTC 660 

AGGGTCCOGC AGAGQ6ACAT GAGGCCTGA8 AA0CTGAAG6 AGOCCTGGCC CAACAGTGAC 720 

CCCCCCTTCT CCRGAAGAA OGTCATCAGC CTGACAGAAG ATGTGGAT6A G TTCO O GAAT 780 

AAACTGCAQG GAGAGOSGAT CTCAG6CAAC CTGGATGCTC CTGAGGG06G CTT06ATGCC 840 

ATCCTGCAGA CAGCTGTGTG CAOSAGQGAC ATTQGCTGGC GCCCGGACAG CACCCACCTG 900 

CTGGTCTTCT CCACCGAGTC AGCCTTCCAC TATGAGGCTG ATGGOGCCAA CGTGCTGGCT 960 

G6CATCATGA GC06CAACGA TSAAGGGTGC CACCTGGACA CCAOGGGCAC CTACAOCCAG 1020 

TACAGGACAC AGQACTACCC GTOGGTGCCX: ACCCTGGT6C G0CTGCT06C CAA6CACAAC 1080 

ATCATCOCCA TCTTTGCTGT CACCAACTAC TCCTATAGCT ACTAOGAGAA GCTTCACACC 1140 

TATTTCCCTO TCTCCTCACT GQGGGTGCTG CAGGAGGACT CGTCCAACAT CGTGGAGCTG 1200 

CTGGAQGAGG CCTTCAATCG GATCC6CTCC AACCTOQACA TCOGGGCXXJT AGACAGCCCC 1260 

C6AG6CCTTC GGHCAQAOGT CACCTCCAAG ATGTTCCAGA AGACGAGGAC TGGGTCCTTT 1320 

CACAT0C3QGC G6GG66AAGT GGGTATATAC CAG6T6CAGC T60GGGCCCT T6AGCA0GTG 1380 

G ATGQ GAOGC ACGT6T0CCA GCTGCOGGAG GACCAGAAGG GCAACATCCA TCTGAAACCT 1440 

TCCTTCTCaS AOQGCCTCAA GATGGAOGOS GGCATCATCT GTGATGTGTG CACCTGCGAG 1500 

CT6CAAAAAG AGGTGCGGTC AGCTCX3CTGC AGCTTCAAOO GAGACTTan* OTQCGGACAO 1560 

TGTGTGTGCA G06AGGGCTG GAGTOQCCAG ACCIGCAACT OCTCCACOQO CTCTCTGAGT 1620 

6ACATTCA0C CCTGCCTGCG GGAQGGG6AG GACAAOC08T GCT C OGGCOO TG066AGTGC 1680 

CAGTGCGGGC ACT6TGTGTG CTACGGOGAA GGCOGCTACXS AGGGTCAGTT CTGCGAGTAT 1740 

GACAACTTCC AGTGTCCCOG CACTTCC3GGG TTCCTCTGCA ATGACCGAGG ACGCTGCTCC 1800 

ATGGGCCAGT GTGTGTGTGA GCCTGGTTGG ACA6GCCCAA GCTGTGACTG TC0CCTCA6C 1860 

AATGCCAOCT GCATOGACAO CAATOOGGGC ATCTGTAAIIG GACGTGGOCA C1GT8AGTGT 1920 

GGCCGCTGCC ACTOOCACCA GCAGT06CTC TACAC6QACA CCATCTG06A 6ATCAACTAC 1980 

TOGGOGATCC A0CCX3GGCCT CTGCGAGGAC CTAOGCTCCT GOGTGCAGTO CCAGGOOTGG 2040 

GGCACCGGCG AGAAGAAGGG . G06CA0GT6T GAGGAATGCA ACTTCAAGGT CAAGATGGTG 2100 

GACGAGCTTA AQAGAGCCGA GGA6GTGGTG GTG0SC216CT CCTTCOSGGA 0GA6GATGAC 2160 

GACT6CA0CT ACAGCTACAC CATGGAAGOT GA0QQCX2CCC CTO G GCC C RA CAGCACTGTC 2220 

CTGGTGCACA A6AAGAAGGA CTGCCCTCC6 GGCTCCTTCT GOTGGCTCAT CCCCCTOCTC 2280 

CTCCTCCTCC TGCC3GCTCCT G6CCCTGCTA CTGCTGCTAT GCTGGAAGTA CTGTGCCTGC 2340 

TGCAAGGCCT GCCTGGCACT TCTCCCGTGC TGCAACCGAG GTCACATGGT GGGCTTTAAG 2400 

GAAGACCACT ACAT6CTG06 GGAGAACCTO ATGGCCTCTG ACCACTTGGA CA060CCATG 2460 

CTG0GCA6G6 GGAAOCTCAA GQGC0GT6AC GTGGTCOGCT GOAAGGTCAC CAACAACATG 2520 

CAGCGGCCTG GCTTTOCCAC TCATGCOGCC AGCATCAACC CCACAGAGCT G G T G CCCTAC 2580 

GGGCTGTCCT TGCGCCTGGC CCGCCmGC ACCGAGAACC TGCTGAAGOC TGACACTOGG 2640 

GAGTGOGCCC AGCTGCDGCCA GGAGGTGGAG GAGAACCTGA AOSAGGTCTA CAGGCAGATC 2700 

TCOSGTGTAC ACSAGCTCCA GCAGACCAAG TTCOGGCASC AGCOCAATGC CGGGAAAAAG 2760 

CAAGACCACA CCATTGTGGA GACAGTGCTO ATGGGGCCGC QCTGGGCCAA 6CC6GCCCTG 2820 

C31QAAGCTTA CAGA6AA6CA 0GTG6AACAG AG66CCTTCC AC3QACCTCAA GGTGOCCCCC 2880 

GGCTACTACA CCCTCACTGC AGACGAGGAC GCCOGGGGCA TGGTGGAGTT CCAGGAGQOC 2940 

GTXSGAGCTGG TGGAOGTAOG GGTGCCCCTC TTTATCCGGC CPGAGGATGA OGAOQAQAAG 3000 

CACCTGCTGQ 7GGAG6CCAT CGACGTGCGC GCAGGCACTG CCA0CCTC6G CCGCGGOCTG 3060 
GTAAACATOV CCATCATCAA OGAGGAAOOC A8AGACGTGG TCTCCIYIGA, GCAGCCTGA6 3120 

irCTOG Q TOl GOOGCQG G GA CCAGCTQGCC CSGCKTCCCTG TCATCCGGGG TGTOCTG G AC 3180 

GQCGGGAAGT CCCA6GTCTC CTACOGCACA CAGGATGGCA COGCGCAGGG CAACOGGQAC 3240 

TACATCCCOO TGGAGGGTGA GCTGCTGTTC CAGCXTGGGG AGGCCTGGAA AGAGCTGCAG 3300 

GT6AAGCTCC TGGAGCTGCA AGAAGTTGAC TCCCTCCT6C GGGGC06CCA GGTCOGCCGT 3360 

TTCC A OGTCC AGCTGAGCAA COCIAAGTTT OGGGCCCftOC TQGGCCAGOC OCACTCCACC 3420 

ACCATCATOl TCAGGGAOCC AfiATGAACIO GACOGQAGCT TCAOGAGTCA GATGTTGTCA 3480 

TCACAGCCAC CCCCTCAOGO OGACCTGGGC GCOCOGCAGA ACCCCAATGC TAAGGOCX5CT 3540 

GGGTCCAGGA AGATCCATTT CAACTGGCTG OCCCCTTCTG GCAAGCCAAT QQGGTACAGG 3600 

GTAAAGTACT GGATTCAGGG TGACTOOGAA TCGGAAGOCC AOCTGCTOGA CAGCAAGQT6 3660 

CCCrCAGTGG AGCTCACCAA C3CTGTACC06 TATTGOQACT AIGASATOAA GGTGTGOGOC 3720 

TAC3GGG6CTC AGGG06A666 ACCCTACAGC TCGCTGGTGT 0CTGC06CAC CCAGCAGGAA 3780 

GTGCCCAGC3G AGCCAGGGOG TCTGGCXnTC AATGTOGTCT CCTCCAOGGT GACXX:AGCTG 3840 

AGCT6GGCTQ AGC06GCTGA GACCAAOGGT GAGATCACAG CCTAOGAGGT CTGCTATGGC 3900 

CTGGTCAA06 AT6ACAAC06 ACCTATTGGG CCCATGAAGA AAGTGCTGGT TGACAACCCT 3960 

AAGAACOGGA TGCIGCTTAT TGAGAAGCTT GG6GAGTCCC AGCCCTAC06 CTACA0G6TG 4020 

AAG60GCGCA A0GGGGCCXX3 CTGGGGGCCT GAG0QGGAG6 CCATCATCAA CCTGGCCACC 4080 

CAGCCCAAGA GGCCCATGTC CATCCCCATC ATCCCTGACA TCCCTATCGT GGACGCCCAG 4140 

AGOSGGGAGG ACTAC3GACAQ CTTCCTTATG TACAGOGATG AOGTTCTAOG CTCTCCATaS 4200 

GGCAGCX:AGA GGCCCAGCGT CTCCGATGAC ACTGAGCACC T6GT6AAIGG COGGATGGAC 4260 

TTTGCCTTCC OGGGCAGCAC CAACTCCCTO CACAOGATSA 0CACGACCA6 TGCTGCTGCC 4320 

TATGGCACCC ACCTGAGCCC ACAOGTGCCC CACOGCGTGC TAAGCACATC CTOCACCCTC 4380 

ACACGGGACT ACAACTCACT GACC06CTCA GAACACTCAC ACTCGACCAC ACT6CCGAGG 4440 

GACTACTCCA CCCTCACCTC OGTCTCCTCC CAOGACTCTC GCCTGACTGC TGGTGTGCCC 4500 

GACAOGCCCA CCCGCCTGGT GTTCTCTGCC CTGGGGCCCA CATCTCTCA6 AGTGAGCTG6 4560 

CAGGAGCCGC 6GTGCGAG00 GCOGCTGCAG 6GCTACAGTG TGGAGTACCA GCTGCTGAAC 4620 

GGCGGTGAGC TGCATCGGCT CAACATCCCC AACXXTTGCCC AGACCTOGGT GGTGGTGGAA 4680 

GACCTCCTGC CCAACCACTC CTACGTGTTC OGCOTGOGGG CCCA6AGCCA GGAAGGCTGG 4740 

GGCCGAGAGC GTGAGGGTGT CATCACCATT GAATCCCAGG TGCACCOGCA GAGCCXIACTG 4800 

TGXXMCCTGC CAGGCTCOGC CTTCACTTTO AGCACTCCCA GTGCCCCAGG CCOQCTGGTG 4860 

TTCACTGCCC TGAGCCCAGA CTOGCTGCAG CTGAGCTGGG AGOSGCCAOG GAGGCCCAAT 4920 

GGGGATATOQ TCGGCTACCT GGTGACCTGT GAGAT06CCC AAGGAG6AGG GCCAGCCACC 4980 

GCATT0CX»3G TGGAT6GAGA C3UKX00GAG AGCCGGCTGA C06TGC0GGG OCTCAGCGAG 5040 

AAOGTGOOCT AGAAGTTCAA GCTTGCAGGCC AGGACCACTO AGGGCTTCGG GCCAGAGCGC 5100 

GAGGGCATCA TCACCATAGA GTCCCACQAT GGAQGACCCT TCCCGCAGCT GGGCAGCC3GT 5160 

6006GGCTCT TCCAGCACXX: QCTGCAAAGC GAGTACAGCA GCATCACCAC CACCCACACC 5220 

AG06CCAC06 AGCCCTTCCT AGT6GATGGG CCGACCCTGG GGGCCCAGCA CCTGGAGGCA 5280 

GGOGOCTCCC TCA000G6CA T6TGACCCA6 GAGTTTGTGA GC06GACACT GACCACCAGC 5340 

GGAA O CCTTA GCAOCCACAT GQIOCAACAG TTCTTCCAAA CTT6AC0GCA CCCTGCCCCA 5400 

CCCCOGCCAT GTCCCACTAG GCGTCCTCCC GACTCCTCTC CC66AGCCTC CTCAGCTACT 5460 

CCATCCTTGC ACCCCTGGGG GCCCAGCCC3V (XCGCATGCA CAGAGCAGGG GCTAGGTGTC 5520 

TOCTGGGAOG CA7GAAGGGG GCAAGGTCOO TCCTCTGTGG GOCCAAACCT ATTTGTAACC 5580 

AAAGAGCtGQ 6AGCA0CACA AGGACCCAOC cmXilTCm C3U:TTAATAA ATGGTTTTOC 5640 
TACTG 

Seq ID NO: 561 Protein sequence 
Protein Accession #: NP_000204.1 

1 11 21 31 41 51 

I I I I I I 

MAGPRPSPWA RLUiAALISV SLSGTLANRC KKAPVKSCTB CVRVDKDCaY CTDEMFRDRH 60 

CNTQAELLAA 6CQRBSIWM BSSFQITEBT QZDTTIiRRSQ HSPQGLRVRL RPGBERHFEZi 120 

EVFEPIiESPV DLYIXM3FSN SMSDDLDmiX KMGQNLARVL SQLTSDVTIG FGKFVDXVSV 180 

FQTDMRPEKL KBPWFNSDPP FSFIQIVISLT EDVDEFRHXL QGERISOIU) APB60FDAZL 240 

QTAVCTRDIG WRPDSTHLLV FSTESAFHYB ADGANVLAGI MSRNDBRCHL DTTGTYTQYR 300 

TQDYPSVPTL VRLLAKHMII PIFAVTNYSY SYYEKLHTYP PVSSLGVLQB DSSNIVELLB 360 

EAFMRIRSNL DIRALDSPRG LRTEVTSKMF QKTRTGSFHI RKGEVGIYQV QLRALEKVZX; 420 

THVCQLPEDQ KBITIKliKPSF SDGLKKDAOI IGDVCTCELQ KEVRSARCSF 2n3)FVCGQCV 480 

CSBGHSGQTC HCSTGSLSDI QPCLRBGBDK PCSGRGEOQC GHCVCYGEGR YEGQFCEn>N 540 

FQCPRTSGFL CNDRGRCSMG QCVCBPGWTG PSCDCPLSKA TCIDSNGGIC NGR6HCB0GR 600 

CHCHQQSLYT DTICEINYSA IHPGLCEDLR SCVQOQAWGT GEKKGRTCEE OJPKVKMVDE 660 

LKRAEEWVR CSFRDEDDDC TYSYTMEQDO APGPNSTVLV HXKKDCPPGS FWWLIPLLLL 720 

LLPLLALLLL IiCmCYCACCK ACLALLPGCai RGHMVGFKED BYKLRENLMA SDRLDTFMIiR 780 

SGNLRGRDW macmOMQR P6FATHAASI NPTELVPYGL SLRLARLCTE NLLKPDTRBC 840 

AQLRQEVEEN UIEVYRQISG VRKLQQTKFR QQPNAGKRQD HTIVDTVLMA PRSAKPALLK 900 

LTEKQVEQRA PHDLKVAPGY YTIiTADQDAR <24VEPQEGVB liVDVRVPLFI RPBDDDEKQL 960 

LVEAZDVPAG TATL6RRLVN ITIZKEQARD WSPEQPBFS VSRGDQVARX PVIRRVIiDGG 1020 

KSQVSyRTQD GTAQGNSDYX PVEGBLLFQP GEAKRBLQVK hLBUOBVDBh LRGRQVBRFB 1080 

VQLS2TPKF6A HLQQPHSTTI ZZSDPDEIiDR SFT8QMLSSQ PPPBGDL6AP QNFNAXCAA6S 1140 

RXIHFUWLPP SGKPMGYRVK YWIQGOSESE ARLZiDSXVPS VELTNLYPYC DYEKKVCAYO 1200 

AQGEGPYSSL VSCRTHQBVP SEPGRLAPNV VSSTVTQLSW AEPAETNGEI TAYBVCYGLV 1260 

NDDNRPZGPM KKVLVDNPKN RMLLIBNLRE SQPYRYTVKA RMQAGWGPER EAZZNLATQP 1320 

KRFMSZPZZP PIPZVDAQSG EDYDSFLMYS DSVIiRSPSOS QRPSVSOZyTB BLVN6RMDFA 1380 

FPGSTNSLHR MTTTSAAAYG TBLSPHVFHR VLSTSSTLTR OYNSLTRSEH SRSTTLPRDY 1440 

STLT5VSSHD SRLTAGVPDT PTRLVFSALG PTSLBVSWQE PRCERPLQGY SVEYQIiLHGG 1500 

ELHRUaPt^ AQTSVWEDL LPNHSYVFRV RAQSQBGW6R ERBGVITIES QVHPQSPIiCP 1560 

LPGSAFTLST PSAPGPLVFT ALSPDSLQLS NERFRRFN6D ZVOYLVTCEM AQGGGPATAF 1620 

RVDQ>SPBSR LTVP6LSBIV PYKFXVQART TEGFGPEREG ZXTZESQDG6 PFPQLGSRAG 1680 

LFQHPLQSEY SSITTTHTSA TEPFLVD6FT LGAQHIiSAGG SLTRBVTOGF VSR11.TTSGT 1740 
LSTEMDQQPP QT 

Seq ID NO: 562 DNA sequence 

Nucleic Acid Accession #: NM_013332.1 

Coding sequence: 1..63 

1 11 21 31 41 51 

I I I I I I 

GOIOGAGQGC GCTTTTGTCr COGGTGAGTT TTGTOGCOGG AAGCTTCTGC GCTGGTGCIT 60 
AGXAACGBAC TTTCClt'OGG ACTCCT6CAC GAOCTGCTOC TAOU3C0G0C 6ATCCACTCC 120 
OGGCTGTTCC CCOOGAGGGT CCAGAGGCCT TTCAGAAGGA GAAG6CA6CT CTGTTTCTCT 180 
GCAGAGGAGT AGGGTCXTTTT CAGCCATGAA GCATX3TGTTG AACCTCTACC TGTTAGGTGT 240 
GGTACTGACC CTACTCTCCA TCTTG6TTA6 AGTGATG6A6 TCCCTAGAAa GCTTACTASA 300 
GA6CCCAT08 CCTGQGACCT CCTGQACCAC CAGAAGCCAA CTAGCCAACA OUaAGCCQlC 360 
CAAG6GOCTT GCAGACCATC CATCCAGAA6 CATGTGATAA GACCTCCTTC CATACTGGCC 420 
ATATTTTGGA ACACTGACCT AGACATGTCC AGATGGGAGT CCCATTCCTA GCAGACAAGC 480 
TGAGCACCGT TGTAACC3VGA GAACTATTAC TAGGCCTTGA AGAACCTGTC TAACTGGATG 540 
CTCATTGCCT GG6CAAGGCC TGTTTAG6CC GGTTGGGGTQ GCTCATGOCT GTAATCCTAG 600 
CACTTTGGGA GGCTGAOGIG 6GTC36ATCAC CTGAQOTCAG 6A0TTCSGAGA CCAGOCTOGC 660 
CAACATG60CS AAACCCCATC TCTACTAAAA ATACAAAA6T TAGCTGGGTG TGGTGGCAGA 720 
GGCCTGTAAT CCCAGTTCCT TGGGAG6CT0 AGGOGGGAGA ATTGCTTGAA CCOGGGGAOG 780 
GAG6TTGCAG TGAACCGAGA TCXXACTGCT GTACCCAGCC TX^GCCACAG TGCAAGACTC 840 
CATCTCAAAA AAAAAAAGAA AAGAAAAAGC CT6TTTAATG CACAGGT6T0 AGTGGATTGC 900 
TTATGGCTAT GAGATAGGTT GATCTOGCCC TTACCXXSGOQ OTCTOGTgTA TGCTGTGCTT 960 
TCCTCM5CA6 TATGGCTCTG ACATCTCTTA GATGTOOCAA CTTOUSCTGT TGGGAGATGG 1020 
TGATATTTTC AACCCTACTT CCTAAACATC TGTCTGGGGT TCCTTTAGTC TTGAATGTCT 1080 
TATGCTC3VAT TATTTGGTGT TCACOCTCTC TTCCACAAGA GCTCCTCCAT GTTTGGATAG 1140 
CAG7TGAAGA GGTTGTGTGG GTGGGCTGTT GGGAGTGAGG ATGGAGTGTT CAGT6C0CAT 1200 
TTCTCATTTT ACATTTTAAA GTOGTTCCTC CAACATAGT6 TGTATTGQTC TGAAGOGQGT 1260 
GG7GG6ATGC CAAAGCCTGC TCAAOTTATG QACATTGrTOO CCACCATGTG GCTKAAATGA 1320 
TTTTTTCTAA CTAATAAAGT G6AATATATA 7TTCAAAAAA AAAAAAAAAA AA 

Seq ID NO: 563 Protein sequence 
Protein Accession #: NP_037464.1 

1 11 21 31 41 51 

I I I I I I 

MKHVUTLYLL 6WLTLLSIP VRVMBSLEGL LESPSFGTSH TTRSQLANTE PTKGLPDHPS 
HSK 

Seq ID NO: 564 DNA sequence 

Nucleic Acid Accession #: NM_023915.1 

Coding sequence: 250..1326 

1 11 21 31 41 51 

I I I I I I 

G6CA0GAGG6 TTTOGTTTTC ATQCTTTACC A6AAAATCCA CTTCOCTGCC GACCTTAGTT 60 
TCAAAGCTTA TTCTTAATTA GASAOVAGAA AOCTGTTTCA ACTTGAAGAC ACCGTATGAG 120 
6TGAATGGAC AGCCAGCCAC CACAATGAAA GAAATCAAAC CASGAATAAC CTATGCTGAA 180 
CCCACGCCTC AATC6TCCCC AAGTQTTTCC TGACAOGCAT CTTTGCTTAC AGTGCATCAC 240 
AACIGAA6AA TGQGGTTCAA CTTGAC6CTT GCAAAATTAC CAAATAA08A GC7GC310GGC 300 
CAASA8A6TC ACAATTCAG6 CAACAGGAGC QAOGGGCCAG QAAAGAACAC CACOCTTCAC 360 
AATGAATTTG ACACAATTGT CTTGCCGGTG CTTTATCTCA TTATATTTGT GGCAA6CATC 420 
TTGCTGAAT6 GTTTAGCAOT GTGGATCTTC TTCCACATTA GGAATAAAAC CAGCTTCATA 480 
TTCTATCTCA AAAACATAGT GGTTGCAGAC CTCATAATGA OGCTGACATT TCCATTTOGA 540 
ATAGTCCATO AT6CAGGATT T06A0CTTGG TACTTCftAGT TTATICTCTG CAGATACACT 600 
TCAGTTTTGT TTTATGOVAA CATGTATACT TCCATOGTGT 7CCTTGGGCT GATAAGCATT 660 
GATCGCTATC TGAAGGTGGT CAAOCCATTT GGGGACTCTC GGATGTACAG CATAACCTTC 720 
ACGAAGGTTT TATCTGTTTG TGTTTGGGTG ATCATGGCTG TTTTGTCTTT GCCAAACATC 780 
.ATGCTGACAA ATG6TCAGGC AACACSAGGAC AATATCCAT6 ACTGCTCAAA ACTTAAAAGT 840 
CCTTTGOGGG TCAAATG6CA TAOGOCAGTC ACCTATGTGA ACABCTGCTT 8TTTG1G6CC 900 
GTGCTGGTGA TTCTGATCGG ATGTTACATA 6CCATATCCA G6TACATCCA CAAA TCCHGC 960 
AGGCAATTCA TAAGTCA6TC AAGCOGAAAG OGAAAACATA A0CA6A0CAT CAGGGTTGTT 1020 
GTGGCTGTGT TTTTTACCTG CTTTCTACCA TATCACTTGT GCA6AATTCC TTTTACTTTT 1080 
AGTCACTTAG ACAGGCTTTT AGATQAATCT GCACAAAAAA TCCTATATTA CTGCAAAGAA 1140 
ATTAOIGTTT TCTTGTCTOC GTGTAATGTT T6CCTGGATC CAATAATTTA CTTTTTCATG 1200 
T6TAGG1CAT TTTCAAGAAG GCTGTTCAAA AAATCAAATA TCAGAACCAG GAGTOAAAGC 1260 
ATCASATCAC TGCAAAGTGT GAGAAGATCG GAAGTTOGCA TATATTATGA TTACACTQAT 1320 
GTGTAGGCCT TTTATTGTTT GTTGGAATOQ ATA1QTACAA AG1GTAAATA AATGTTTCTT 1380 
TTCATTATCC TTAAAAAAAA AA 



Seq ID NO: 565 Protein sequence 
Protein Accession #: NP_076404 

1 11 21 31 41 51 

I I I I I I 

MGFNLTLAKL P2INELHGQES BKS6KRSDGP GKMTTLHNEF DTZVLPVLYL IIPVASILIN 60 
GLAVWIFFHI RNKTSFIFYL KNIWADLIM TLTFPFRIVH DAGPGPWYFK FILCXYTSVL 120 
PYANMYTSIV FLGLISIDRY LKWKPPGDS RMYSITFTKV LSVCVWVIMA VLSUPNIILT 180 
H6QPTEDKIB DCSKLKSPLG VKHBTAV7YV NSCLFVAVLV ZLXGOriAIS RYIHKSSRQF 240 
I8QSSRKRKB HQSZRVWAV FFTCFLPYHIi CRIPFTPSBL DRLLOESAQK ILYYCKBITIi 300 
FLSACMVOiD PIIYPFMCRS PSRRX.PKKSI7 ZRTRSESZRS U1SVRR8EVR ZYTDYTOV 

Seq ID NO: 566 DNA sequence 

Nucleic Acid Accession #: NM_005365.1 

Coding sequence: 1..948 

1 11 21 31 41 51 

I I I I I I 

ATGTCTCTCX3 AGCAGAGGA6 TC06CACTGC AAGCCTGATG AAGACCTTGA AGCCCAAGGA 60 
GA0GACTTG6 GCCTGATQQQ TGCACASQAA CCCACAG606 AGGAGGAGGA GACTACCTCC 120 
TCCTCTGACA 6CAA0GAG6A GGA6GTGTCT GCTGCTGGGT CATCAAGTOC TCCCCA6AGT 180 
CCTCAGG6AG GOGCTTCCTC CTCCATTTCC GTCTACTACA CTTTATGGA6 OCAATTOGAT 240 
GAGGGCTCCA GCAGTCAAGA AGAGGAAGAG CCAAOCTCCT OQGTCQACCC AGCTCAGCTG 300 
GAGTTCATGT TCCAAGAAGC ACTGAAATTG AAGGTGGCTG AGTTGGTTCA TTTCCT6CTC 360 
CACAAAIA3C GAGTCAACGA GC06GTCACA AA08CAGAAA TGCTOQAGAG 08TGATGAAA 420 
AATTACAAGC GCTACTTTOC TOTQATCTTC GQCAAAGCCT COtAGTTCAT 6CA6GTGATC 480 
TTTGGCACTG ATGTGAAOGA GGTGGACCCC GCOGGCCACT CCTACATCCT TGTCACTGCT 540 
CrrGGCCTCT OGTGCGATAQ CATGCTGGGT GATGGTCATA GCATGCCCAA GGCOGCCCTC 600 
CIG^VTCATTG TCCT6GGT6T GATCCTAACC AAAGACAACT GCGC00CT6A AGAGGTTATC 660 
TQGGAAG06T TGA6TGTGAT GGGGGTGTAT GTTGGGAAiGG AGCACATGTT CTAOOQGGAG 720 
CCCAG6AAGC T6CTCACCCA AGATTGGGTG CAGGAAAACT ACCTQQAOTA 00GGCA6GTG 780 
CCOGGCAGTG ATCCTGCGCA CTAOOAGTTC CTCTGGGOTT CCAAG6CCCA CGCTQAAACC 840 
AGCTATGAGA AGGTCATAAA TTATTTGGTC ATCCTCAATG CAAGAGAGCC OWTCTGCTAC 900 
CCATCCCTTT ATGAAGAG6T TTTGGGAGAG GAGCAAGAGG GAGTCTGA 

Seq ID NO: 567 Protein sequence 

Protein Accession #: NP_005356.1 

1 11 21 31 41 51 

I I I I I I 

MSLEQRSPEC KPDEDLEAQ6 EDLGLMGAQB PT6EEEETTS SSDSKEEEVS AAGSSSPPQS 60 
PQGGASSSZS VYYTLWSQFD BGSSSQEEBB PSSSVDPAQL EFMPQEALKL KVAELVHFLL 120 
HKYRVKEPVT KAEMLBSVIK NYKRYFPVIP GKASEFMQVI FGTDVKEVDP AGHSYILVTA 180 
LGLSCDSMLG DGHSMPKAAL LIIVLGVILT KDNCAPEEVZ WEALSVMGVY VGKEKMFYGE 240 
PRKLLTQDHV QEKYLEYRQV PGSDPABYEP UHGSKABAET SYEKVINYLV MXMARBPXar 300 
PSLYEEVLGE EQEGV 



Seq ID NO: 568 DNA sequence 
Nucleic Acid Accession #: NM_014400 
Coding sequence: 86..1126 

1 11 21 31 41 51 

I I I I I I 

GGTTACTCAT CCTGGGCTCA ggtaagaggo CC0QAGCTC6 GAGGC6GCAC ACCCAGGGGG 60 
GACXSCCAAGG GAGCAGGACO gagcx:atgga CCCCGCCAGG AAAGCAGGTG CCCAGGCCAT 120 
GATCTGGACT GCAQGCTGGC TGCTGCTGCr GCXGCTTOGC GGAGGAGC6C AGGCCCTGGA 180 
GTGCTACAiQC T60GTGCAGA AAGCAGATGA CGGATGCTCC CCGAACAAGA TGAAGACAGT 240 
GAAGTG0GCX3 CCG6GG6TG6 ACGTCT6CAC GGA6GC06TG GGGGOGGTGG AGACCATCCA 300 
CGGACAATTC TOGCTGGCAO TGCSGGGTTG 0GGTTO3GGA CTCCCCGGCA AGAATGACCG 360 
CG6CCTGGAT CTTCAOGGGC TTCTGGOGTT CATCCAGCTG CAGCAATGCG CrCAGGATCG 420 
CT6CAACX*CC AAGCTCAACC TCACCT0606 GGCGCTCGAC COGGCAGGTA AT6A6AGTGC 480 
ATA0CC3SC0C AAG8G0GTGG AGTGCTACA6 CT6T6TGGGC CTGAGCC6GG AG606TGCCA 540 
6GGTACAT06 C0G00GGT06 TSAGCTGCTA CAAC6QCA6C GATCATGTCT ACAAGGGCTG 600 
CTTOGA06GC AAOSTCACCT T6A0GGCA6C TAATGTGACT .......... CTGTCOGGGG 660 
CTGTGTCCAG GATGAATTCT GCACTC6GGA TGGAG7AACA GGCCCAGGGT TCACGCTCAG 720 
TGGCT0CT6T TQCCAGGGGT CC0GCT6TAA CTCTGACCTC CGCAACAAGA CCTACTTCTC 780 
OCCTOGAATC OCAOCCCTTG TC0G6CT00C CCCTCCA6AG CCCA0GACT6 TQQGCTGAAC 840 
CACATCTGTC ACCACTTCTA CCTOGGCCOC AGTGA6AC0C ACATCCACCA CCAAACCCAT 900 
GCCA6GGCCA ACCAGTCAGA CTCCX3AQACA GGGA6TA6AA CACGAGGCCT CCGGGGATGA 960 
G6A6CCCAGG TTGACTGGAG GOGCCGCTGG CCACCAGGAC CGCAGCAATT CAGGGCAGTA 1020 
TCCTGCAAAA 6G0GGGCCCC AGCAGCCCCA TAATAAAGGC TGTGTGGCTC CCACAGCTGG 1080 
ATTGGCAGGC CTTCTGTTGO C06TGGCT6C TGGTGTCCTA CTGTGA6CTT CTOCAOCTGG 1140 
AAATTTOCCT CTGACCTACT TCTCTCGOCC T6G6TACCCC TCTTCTCATC .......... 1200 
CCCACCACTG GACTGGGCTG GCCCAGCCCC .......... ACATTCCCCA GTATCCCCAG 1260 
CTTCTGCTGC .......... GGCTTTGGGA AATAAAATAC OGTTGTATAT ATTCTGGCAG 1320 
GG6TGTTCTA GCTTTTTGAG GACAGCTGCT GTATCCTTCT CATCCTTGTC TCTCOGCTTG 1380 
TCCTCTTOTO ATGTTAOGAC AOAGTGAGAG AAGTCAGCTG TCA06G66AA GGTGAGAaAO 1440 
AG6ATGCTAA OCTTCCTACr CACTTTCTCC TAOCCAGCCT GGACTTTGGA GOQTGGGOTG 1500 
GGTGGGACAA TGGCTCCCCA CTCTAAGCAC TGCCTCOOCT ACTCCCCGCA TCTTTGGGGA 1560 
ATC3GGTTCCC CATATGTCTT CCTTACTAGA CTGTGAOCTC CTCGAGGGCA GGGACCGTGC 1620 
CTTATGTCTG TGTGTXaATCA GTTTCTGGCA CATAAATGCC TCAA7AAAGA TTTAATTACT 1680 
TT8TATAGTG 

Seq ID NO: 569 Protein sequence 
Protein Accession #: NP_055215 

1 11 21 31 41 51 

I I I I I I 

MDPARKAGAO AMIWTAGWUt UiLLRGGAQA LECYSCVQKA DDGCSPNKMK TVKCAPGVDV 60 
CTEAVGAVET IHGQFSLAVX GCGSGLPGKN DRGLDIjHGLL AFIQLQQCAQ DSC3IAKLNLT 120 
SRALDPACaSE SAYFFNGVBC YSCVGLSBEA OQGTSPPWS CY21ASDHVYK GCFIXaiVTLT 180 
AAHVTVSLPV RGCVQDEPCT RDGVTCTGFT LSGSCOQGSB QiSDLSzncnr FSPRIPPLVR 240 
IiPPPEFTTVA STTSVTTSTS AFVRPTSTTK PMPAPTSQTP RQ6VEHBASR DEEPRLTGGA 300 
AGHQDRSNSG QYPAKGGPQQ PHNK6CVAPT A6LAALLLAV AAGVU. 

Seq ID NO: 570 DNA sequence 
Nucleic Acid Accession #: NM_005329.1 
Coding sequence: 1..1662 

1 11 21 31 41 51 

I I I I I I 

AT6C0GGTGC AGCTGA06AC AGCCCTGOGT OTGGTGGOGA CCAGCCTGTT TGCCCTGGCA 60 

GT6CTGG6TG GCATCCTGGC A0CCTATGT6 ACG6GCTA0C AGTTCATCCA CA06GAAAA6 120 

CACTACCTGT CCTTCGGCCT GTACGGCGCC ATCCTGGGCC TGCACCTGCT CATTCAGAGC 180 

CTTTTTGCCT TCCTGGAGCA CCGGCGCATG CGAOGTGCOG GCCAGGCCCT QAAGCTGCCC 240 

TCCCCQCG6C GGGGCTCGGT GGCACTGTGC AT760CGCGT ACCAG6AGGA CCCTGACTAC 300 

TTGCGCAAGT GCCTG06CTC G6CCCAGCGC ATCTCCTTCC CTQAOCTCAA G6TG6TCATG 360 

GTCGT6GATG GCAACC6CCA GGAGGAOGCC TACAT6CTGG ACATCTTCCA OGAOGTGCTG 420 

GG0G6CACC6 AGCAG6C06G CTTCTTTGTG TGGCGCAGCA ACTTCCATGA GGCAG606AG 480 

GGTGAGA06G AGGCCA6CCT OCAGGAGGGC ATGGACCGTG TGOGGGATGT G6TGCGGGCC 540 

A6CACCTTCT 06TGCATCAT GCAGAA6TGG GGAGGCAAGC 6CGAG6TCAT GTACACG6CC 600 

TTCAAGGCGC TCGGOGftTTC GQTGGACTAC ATCCAGGTGT 6GQACTCTGA CACI6TGCTG 660 

GATCCAGOCT GCACCAT06A GAT6CTT0GA GTCCI6GAGG AGGHTCOCCSl AGTAGGG66A 720 

GTOSGGGGAG ATGTCCAGAT CCTOVACAAG TAOtSACTCAT GGATTTCCTT CCTGAGCASC 780 

GTGCGGTACT GGATGGCCTT CAACGTGGAG CGGGCCTGCC AGTCCTACTT TGGCTGTGTG 840 

CAiGTGTATTA GTGGGCCCTT GGGCATGTAC 06CAACAGCC TCCTCCAGCA GTTCXrTGGAG 900 

GACTGGTAOC AlplSAAGTT CCTA6GCAGC AAGIGCAGCT TC6GGGATGA COGGCACCTC 960 

ACCAACOGAG TCCTGAGCXTT TGGCTACGGA ACTAAGTATA CC606G6CTC CAAGTOCCTC 1020 

ACAGAGACCC CCACTAAGTA CCTCOGGTGG CTCAACCAGC AAACCCX3CT0 GAGCAAGTCT 1080 

TACTTCCGGG AGTGGCTCTA CAACTCTCTG TGGTTCCATA AGC31CCACCT CTGGATGACC 1140 

TAOGAGTCAG 7X3GTCAOGGG CTTCTTOCCC TTCTTCCTCA TTGCCA06GT TATAOIGCTT 1200 

TTCTAC0GG6 GGCGCATCTG 6AACATTCTC CTCTTOCTGC TGA0G6TGCA GCTGGTGG6C 1260 

ATTATCAAG6 CCACCTA06C C r GC n t.X.Tr 0GGG6CAATG CAGAGATGAT CTTCATGTGC 1320 

CTCTACTCCC TCCTCTATAT GTCCAGCCTT CTGCCJGGCCA AGATCTTTGC CATTGCTACC 1380 

ATCAACAAAT CTGGCTGGGG CACCTCTGGC CX5AAAAACCA TTGTGGTGAA CTTCATTGGC 1440 

CTCATTCCT6 TGTCCATCIG 6GT66CAGTT CTCCTG6GAG G6CTGGCCTA CACAGCTTAT 1500 

T6CC3U3GAOC TGTTCAQTGA GACAGAGCTA 60CTTCCTTG TCTCTGGGGC TATACT6TAT 1560 

GGCTGCTACT G0GTG6C0CT CCTCATGCTA TATCTG GCCA TCATOSOCOO GGSATQIG60 1620 
AAGAAGC06G AGCAGTACAG CTTGGCTTTT 6CTGA0GTGT OA 

Seq ID NO: 571 Protein sequence 
Protein Accession #: NP_005320.1 

1 11 21 31 41 51 

I I I I I I 

MPVQLTTALR WGTSLFALA VLGGILAAYV TGYQFIHTEK HYLSFGLYGA IIjGLHLLIQS 60 

LFAPLEHRRM RRAGQALKLP SPRRGSVALC lAAYQEDPDY LRKCLRSAQR ISFPDLKWM 120 

WDGNRQEDA YMLDIPHEVL GGTEQAGFFV WRSNFHEAGE GETEASLQEG MDRVRDWRA 180 

STFSCXMQKH GGKREVMYTA FKALGDSVDY IQVCDSDTVL DPACTIEMliR VI*£EDPQVGG 240 

VQGDVQILNK YDSWISFLSS VRYWMAFNVE RAGQSYFGCV QCISGPLOW RNSLLQQFLB 300 

DWYHQKPLGS KCSFGDDRHL TNRVLSLGYR TKYTARSKCL TETPTKYLRW UIQQTRWSKS 360 

YFRBWLYNSL WFHKHHLWMT YESWTGFPP FFLIATVIQL FYRGRIWNIL LFLLTVQLVG 420 

IIKATYACFL RGNAEMIFKS LYSLLYKSSL LFAKIFAIA7 ZNKSGHGTSG RKTIWNFIG 480 

LIFVSIWVAV LLGGLAYTAy OQDLFSBTEL AFLVSQAILY GCYHVALUOi YLAZIARRGO 540 
KKPBQYSLAP AEV 

Seq ID NO: 572 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 146..7095 

1 11 21 31 41 51 

I I I I I I 

CACACATAOG CACX3CACGAT CTCACTTGGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTOG CTCCCCCTCC CTCTCCACTC T6AGAA0CAG AGGAGCOGCA 120 

G0G06AGGGG COGCAGACOG TCTGGAAAT6 CGAATCCTAA AGGGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCC6 CCTGGATTGG GCTAATG6AT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTQOCTG GTCCTATACA GGAQCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA GATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG T6AATCTTAA GAAACTTAAA TTTCAGGGTT OGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GT6GAAATTA ATCTCSUnTAA T6ACTAC0GT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TCTCATCTGA TGGATCAGAG CATAGrTTAG AAGGACAAAA ATTTCCACTT 600 

GA6ATGCAAA TCTACTGCTT TGATGOGGAC GQATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

0QAAAAG66A AOTTAAGAGC TTTATOCATT TT6TTT6AGQ TTGGGACA6A A6AAAATTT0 720 

GATTTCAAAO 0GATTATT6A TG6AG70GAA AGT6TTAGTC GTTTTGG6AA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTOGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AAT6CAACAA 960 

TCTQ6TTAT0 TCATQCTGAT QQACTACTTA CAAAACAATT TTCGAQAGCA ACAGTACAAO 1020 

TTC7CTAQAC AGGTOTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC C3V6AAAATX3T TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CA6TTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTT6 1260 

GGTQCTATTC TCAATAA7TT QCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TT6TCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCT6AATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAQAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTCGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

A03AAATACA ATGAAGOCAA GACTAACCGA TCCCCAACAA GAG6AAGTGA ATTCTCTG6A 1620 

AAGQGT6AT0 TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGA0CTGAA6 ATTCTTCAGG CTCCAGTOOC 1920 

GCSACTTCTG CTATCCCATT CATCTCTGAO AACATATCXX: AAGGGTATAT ATTTTCCTCC 1980 

6AAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATOCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCOG ATGTTGGATC AGGCA6AGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGA6ATA CQTGTT6ATG AATCTGAGAA GACAAOCAAG 2220 

TCCTTTTCTG C AGGO CX3«3T GAIXTTCACAG GGTCCCTOUS TIACAGATCT GGAAATSCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCXX3^TCC 2340 

TCCAGACAAC AGGATTTGGT CTCGAOSGTC AAOSTGGTAT ACTCGCAGAC AACCCAACOG 2400 

6TATACAATG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCCTCTA6TC 2460 

ACCOCTTTCT TGCrTBAOiA TCAGATOCTC AACACTACCC CTGCTOCTTC AAGTAGTOAT 2520 

TOGGOCTTGC ATOCTAOGCC TGTATTTOCC AGTGTCX3AT0 TCTCATTTGA ATCCAT C CTG 2580 

TCTT0CTAT6 ATG6T6CACC TTTQCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTGAATTG 2640 

TTTOGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACOGAGAGT 2700 

GAXAA6GT6C OCTTGCATGC TTCTCTQCCA GTGGCTGGGG GTGATTTGCT ATTAGAGCCC 2760 
AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACXSCTGGAA 2820 

TTTGGTAGTQ AATCTGGTGT TCTTTATAAA ACGCTTATGT TTTCTCAAGT TQAACCACCC 2880 

AGCAGTGATQ CCATGATGCA TGCAOGTTCT TCAtSaXCTG AACCTTCTTA TGCCTTGTCT 2940 

GATAATCAGO GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGC3VAT AOCTGTGCAT 3000 

GATTCTGTCG GTGTAACTTA TCAGGGTTCC TTATTTAGGG GCCCTAGGCA TATACCAATA 3060 

OCTAAGTCTT OGTTAATAAC CCCAACTGCA TCATTACTGC AQCCTACTCA TGCCCTCTCT 3120 

GGTCATOGOG AATGGTCTG6 AGCCTCTTCT GATASTGAAT TTCTTTTAOC TGACACAGAT 3180 

GGGCTGACAQ CCCTTAACAT TTCTTCACCT GTTTCTCTAG CTGAATTTAC ATATACAACA 3240 

TCl'GTG'mx S GTGATGATAA TAAGGOGCTT TCTAAAAGTG AAATAATATA TGGAAATGAG 3300 

ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACC CTTCTGAAAG CACAGTCATG 3360 

COCAAC ATGT ASGATAATGT AAATAAfl TTG A ATGOSTC TT TACAA6AAAC Cl ' C TCi T i 'CC 3420 

ATTTCTAGCA CCAAGG6CAT GTTTCCAGGG TCCCTTGCTC AXAOCAOCAC TAAGGTTTTT 3480 

GATCATGAGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTGTC 3540 

TCTCAAGCAT CTGGTGACAC TTC3GCTTAAA CCTGTGCTTA GTGCAAACTC AGAGCCAGCA 3600 

T CCTCT GACC CTGCrrCTAG T6AAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAO 3660 

AOCTCAGCTT CTTTTAGTAC TQAAGTATTQ CTACAAOCTT GCTTTCAOGC TTCTGATGTT 3720 

GACACCTTGC TTAAAACTGT TCTTCCAGCt O'ltSOOCAGTO ATCCAATATT G8TTGAAA0C 3780 

COCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840 

AGT6AAAACA TCCTGCACTC TACATCTGTA CCAGTTTTTG ATGTGTOJOC TACTTCTCAT 3900 

ATGCACTC TG CTTCACTTCA AGGTTT6ACC ATTTCXTATG C3UUSTGAGAA ATATGAACCA 3960 

Gl i'llGlTAA AAAGT6AAAG TTOOCACCAA GTGGTACXrTT CTTTGTACAO TAATGATQAO 4020 

TT6TT0CAAA OGGOCAATTT G6AGATTAAC CAGGCOCATC CCCCAAAAG6 AAGGCAT6TA 4080 

TTTGCTACAC CTGTTTTATC AATTQATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 

CATTCOGATC AAATTTTAAC CTCCACCAAA AGTTCTGTTA CTGGTAAGGT ATTTGCTGGT 4200 

ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACIGATC ATTCTGTTOC TATAGGAAAT 4260 

GQGCATGTT6 OCATTACAGC TGTTTCTCCC CACAGAGAT6 GTrCTCTAAC CTCAACAAAO 4320 

TT6CCGTTTC CTTCTAAGGC AACTTCTGA6 CTGAGTCATA GIGOCAAATC TGAT6CGQGT 4380 

TTAGTX3GGTG GT6GTGAAGA TGGTGACACT GATGATGATG GTGATGAT6A TX3ATGATGAC 4440 

AGAGGTAGTG ATGGCTTATC CATTCATAA6 TGTATGTCAT GCTCATCCTA TAGAGAATCA 4500 

CAGGAAAAGQ TAATGAATGA TTCA6ACACC CAOGAAAACA GTCTTATGGA TCAGAATAAT 4560 

CCAATCTCAT ACTCACTATC T6AGAATTCT GAAGAAGATA ATA6AGTCAC AAGTGTATCC 4620 

TCAGACAGTC AAACTG6TAT 66ACAGAAGT CCTGGTAAAT CACCATCA6C AAATGGGCTA 4680 

TCCCAAAAGC ACAATGATGG AAAAGAGGAA AATGACATTC AGACTGGTAG TGCTCTGCTT 4740 

CCTCTCAGCC CTGAATCTAA AGCATGQGCA GTTCTGACAA GTGATGAAGA AAGTGGATCA 4800 

66GCAA0GTA CCTCAGATAG CCTTAATGAG AATGAGACT7 CCACAGATTT CAGTTTTGCA 4860 

SACACTAATG AAAAAGATGC T6ATGGGATC CTGGCAGCAG GTGACTCAGA AATAACTCCT 4920 

GGATTCCCAC AGTCCCCAAC ATCATCTGTT ACTA60QAGA ACTCAGAAGT GTTCCAOGTT 4980 

TCAGAGGCAG AGGCCAGTAA TAGTAGCCAT GAGTCTOGTA TTGGTCTAGC TGAGGGGTTG 5040 

GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCX5TGT CAGCCCTGAC TTTTATCTGT 5100 

CTAGTGOTTC TTGTGQGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 5160 

TACTTAGAGO ACAGTACATC CCCTAQAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 5220 

ATTTCAGATO ATGTCGGAGC AATTCCAATA AAGCACTTTC C3^AAGCATGT TGCAGATTTA 5280 

CATGCAAGTA GTGGGTTTAC TGAAGAATTT 6AGACACTGA AAQAGTTTTA C!CAGGAAOTG 5340 

CAGAGCTGTA CTGTTGACTT AGGTATTACA GCAGACAGCT CCAACCACGC AGACAACAAG 5400 

CACAAGAATC GATACATAAA TATOSTTGCC TATGATCATA GCAGGGTTAA GCTAGC31CAG 5460 

CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 5520 

AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCX3VCTGA AATCCACAGC TGAAGATTTC 5580 

TXX3AGAATGA TATGOGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTOGTG6AG 5640 

AAAGGAAGGA GAAAATIQTGA TCAGTACTGG CCTGCOGArG GGAGT6AGGA GTACG66AAC 5700 

TTTCTGGTCA CTCAGAAGAG TGTGCAAGTQ CrPGCXTTATT ATACTGTGAG GAATTTTACT 5760 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGTGG ACGTGTGGTC 5820 

ACACAGTATC ACTACA06CA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CXTTGCCAGTO 5880 

CTGACCTTTG TGAGAAAGGC AGCCTATGCC AA60GCCATG CAGT6GGGGC TGTTGTOGTC 5940 

CACTGCAGTG CTGGAGTT6G AAGAACAGGC TlCATATATTG TGCTA6ACA6 TATGTTGCAG 6000 

CAGATTCAAC ACGAAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 6060 

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 6120 

OCXATACTTA GTAAAGAAAC TGA6GTGCT6 6ACAGTCATA TTCAT60CTA T6TTAATGCA 6180 

CTCCTCATTC CT0GA0CA6C A06CAAAACA AAGCTAfiAGA AACAATTCCA GCTCCTQAGC 6240 

GAOTCAAATA TACAGCA6A6 TGACTATTCT GCAGOCCTAA AGCAATGCAA CA6GGAAAAG 6300 

AATOGAACTT CTTCTATCAT CCCTGTGGAA AGATCAA6G0 TTGGCATTTC ATCCCTGAGT 6360 

GGAGAAGGCA CA6ACTACAT CAATGCCTCC TATATCATGG QCTATTACCA GAGCAATQAA 6420 

TTCATCATTA OCCAGCACCC TCTCCTTGAT AOCATCAAGG ATTTCTGGAG GATGATATGG 6480 

GAOCATA ATQ COCAACTGGT QOTTATGAXT CCTGATQGGC AAAACATG6C AGAAGA3X3AA 6540 

TTTGTTTACT 6GCCAAATAA AGATGAGCCT ATAAATTGTG ACSUaCTTTAA GGTCACTCTT 6600 

ATGG C T6AAG AACACAAATG TCTATCTAAT GAGGAAAAAC 7TATAATTGA GGACTTTATC 6660 

TTAGAAGCTA CACAGGATGA TTATGTACTT GAAGTSAQGC ACTTTCAGTO TCCTAAATGG 6720 

CCAAATCCAG ATAGOOCCAT TAGTAAAACT TTT6AACTTA TAA6T6TTAT AAAAGAAGAA 6780 

GCTGCCAATA OGGATGGGCC TATGATTGTT CAT6AT6AGC ATG6AGGAGT GAOGGCAGGA 6840 

ACTTTCT6TG CTCTQACAAC CCTTATGCAC CAACTAGAAA AAGAAAATTC OGTGGATGTT 6900 

TAOCAGGTAG CCAAGATGAT CAATCT6ATG A6GCCA6GAG TCTTTGCTGA CATTGAGCAG 6960 

TATCAGTTTC TCTACAAAGT GATCCTCAGC CTTGTGAGCA CAAGGCAGGA AGAGAATCCA 7020 

TCCACCTCTC TGGACAGTAA TGGTGCAGCA TTGOCTGATG 6AAATATAGC T6AGA6CTTA 7080 

GAOTCTTTAG TTTAACACAO AAAOGGGTOQ GGGGACTCAC AZCTGA6CAT m TlTiX X lV 7140 

TTOCTAAAAT TASGGAGGAA AATCAGTCTA GTTCTGTTAT CXGTT6ATTT CCCATCACCT 7200 

GACA GTAACT TTCATGACAT AGGATTCTGC C6CCAAATTT ATATCATTAA CAATGTGTGC 7260 

CTTrrPGCAA GACTTGTAAT TTACTTATTA TGTTTGAACT AAAATGATTG AATTTTAC3U3 7320 

TATTTCTAAG AATGGAATTG TGGTATTTTT TTCTGTATTG ATTTTAACAG AAAATTTCAA 7380 

TTTATAGAOG TTAOQAATTC CAAACTACAO AAAATGTTT6 TnTTAGTGT CAAATTTTTA 7440 

0CTGTATTT3 TAGCAATTAT CAGGTTTQCT AGAAATATAA CTTTTAATAC AfiTAGCCTGT 7500 

AAATAAAACA CTCTTCCATA TGATATTCAA CATTTTACAA CTGCAGTATT CACCTAAAGT 7560 

AGAAATAATC TX3TTACTTAT TGTAAATACT GCCCTAGTGT CTCCATOQAC CAAATTTATA 7620 

TTTATAATT G TAGATTTTTA TATTTTACTA CTGAGTCAAG TTTTCTAGTT CTGTGTAArT 7680 

GTTTAOTTTA ATGAOGTAGT TCATTAGCTG GTCTTACTCT AOCAGTTTTC TGACATTGTA 7740 

TTGTGTTACC TAAGTCATTA ACTTTGTTTC AGCATGTAAT TTTAACTTTT GTGGAAAATA 7800 

GAAAT ACCTT CATTTTGAAA GAAGTTTTTA T6AGAATAAC AOCTTACCAA ACATTGTTCA 7860 

AATGGTTTTT ATCCAAGGAA TTGCAAAAAT AAATATAAAT ATTQOCATTA AAAAAAAAAA 7920 
AAAAAAAAAA AAAAAAAAAA AAAA 
Seq ID NO: 573 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MRZLKRFIAC IQLLCVCRLD HANGYYRQQR KLVEBIGI^Sy TSAUIQXNWG KtOrPTCNSPK 60 

QSPIHIDEDL TQV27VKLKKL KFQGNDKTSL ENTFIHNT6K TVBINLTNDY RVSGGVSEMV 120 

FKASXITFHW GKCKKSSDGS EKSLEGQKFP LQ4QIYCFDA DaPSSFEEAV KIGKGKLRALS 180 

ILFEVGTEEtl LDFKAIZDGV BSVSRFGKQA ALDPFZLZJOi LPNSTDKm YSGSLTSPPC 240 

TDTVDHZVFK DTVSZSBSQli AVFCBVLTMQ QSGYVKLKDy LOKRFREQQY KPSHQVPSSY 300 

TGXEBZEBAV CSSEPEKVQA DPE2iyTSLLV THERFRWyD THZEKFAVLY QQU)GEDQTK 360 

HSFVTDCYQD LGAZLNKLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMFT ONPELDLFPB 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRG5EPS GKGDVFNTSL NSTSQPVTKL ATEKDXSIiTS QTVTELPFBT VEGTSASU3D 540 

GSRTVLRSFB HHLSGTABSb NTV5ITBYEB BSLLTSFKZiD T6AEDSS6SS PAT6AZPFZS 600 

EHZSQGYIFS SEHPETZTVD VLZPBSARNA SEDSTSSGSE ESLKDPSME6 HVHFPSSTDZ 660 

TAQPDVGSGR ESFLQTNYTE IRVDBSKICTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYPP 720 

TBVTPHAFTP SSRQQDLVST VNWYSQTTQ PWNGETPLQ PSYSSEVFPL VTPLIiLDNQI 780 

UJTTPAASSS DSALHATPVF PSVDVSFESI IiSSYDGAPIi PPSSASFSSE LFRHLHTVSQ 840 

ILPQV7SATB SDKVPLHASL PVAG(H>LLLE PSLAQYSDVL S7THAA5STL EFGSESGVLY 900 

KTLMPSQVEP PSSDAMMaAR SSGPBPSYAL SSNB6SQHIF TVSYSSAIPV HDSVGVTYQ6 960 

SLFSGPSHZP IPKSSLZTPT ASXjLQPTHAL SGDGEWSGAS SDSEFLLPDT XX3LTALNISS 1020 

PVSVABFTYT TSVFGDDHKA LSKSEIIYGN ETELQIPSFN EMVYPSESTV MPNMYDNVNK 1080 

LNASLQETSV SISSTKGMFP GSIiAHTTTKV EDHEISQVPB NNPSVQPTHT VSQASGDTSL 1140 

KPVLSAKSEP ASSDPASSEM LSPSTQLLFY ETSASFSTEV U^QPSFQASD VDTLLKTVLP 1200 

AVFSDPILVB TPKVDKISST MLHLIVSHSA SSENMUISTS VFVFDVSPTS HKHSASbQGL 1260 

TZSYA5EKYB FVLLKSESSR QWPSLYSMD BLFQTANLEZ NQABPPKGHH VFATPVLSZD 1320 

EPZiHTLINKL ZHSDEZLTST KSSVTGKVFft GZPTVASOTF VSTDB8VPI6 NGHVAITAVS 1380 

PHRD6SVTST KLLPPSKATS ELSHSAKSDA 6LVGGGED6D TQDDCaiDDDD DR6SDGLSIH 1440 

K01SCSSYRE SQEKVMMDSD THENSLKDQH KPISYSLSE3I SEEDNRVTSV SSDSQTGMDR 1500 

8PGKSPSAN6 L8QKH21DGKE EKDZQItSSAL LPLSPESKAH AVLTSDEESG SGQGTSDSLM 1560 

ENETSTDFSF ADTOBRZIADO IIiAAffl>SEZT P6FPQSPTSS VTSEHSBVFH VSEAEASHSS 1620 

HESRZGLAE6 LBSEKRAVIP LVZVSALTFI CLWLVGI&Z YHRXCFQTAa FYLEOSTSPR 1680 

VZSTPPTPIF PISDDVGAIP IKHFPKHVAD LHASSGPTEE FBTLKBFYQE VQSCTVDLGI 1740 

TADSSNHPDN KHKNRYIKIV AYDHSRVKIA QIiAEKDGKLT DYIKANYVDG YNRPKAYIAA 1800 

QGPLKSTASD FWRMIHEBNV EVZVMZTMIiV BKORRKCDQY WFADGSBEYG NFLVTQKSVQ 1860 

VLAYYTVBSF TLRNTKZKKG SQRGRPSGRV VTQYBYTQHP DM6VPBYSLP VLTFVRKAAY 1920 

AKREAVGPW VBCSAGVGRT GTYZVIASNL QQIQHBGTVN ZFGFLKHIRS QRNYLVQTEE 1980 

QYVPIRDTLV EAILSKETEV LDSHZHAYVN AUiZPGPAGK TKLEKQFQIfL SQSNIQQSDY 2040 

SAALXQCNRE KNRTSSIIPV ERSRVGZSSL SGEGTDYIHA SYIKGYYQSN BFIITQHPLIj 2100 

HTZKDFWRMZ HDHNAQLWM IFDGQimAED EFVYHPMKDS PINCESFKVT UfAEEHKCLS 2160 

NEEKLZZQOP XI£ATQDDYV LEVSHFQCPR WFNPDSPZSK TFBLZSVZXE EAANRDGPMZ 2220 

VHDBHG6VTA QTFCALTTZM BQLEKa^SVD VYQVAKNZNL MRPGVFADZB QYQFLYKVZL 2280 
SLVSTRQEEN PSTSLDSNGA ALPDGNIAES LESLV 

Seq ID NO: 574 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 148-4518 

1 11 21 31 41 51 

| | | | | | 

CACACATAGO CAOOCAOQAT CTCACTTGQA TCTATACACT G6AG8ATTAA AACAAAOUUl 60 

auUUUUMC ATTTCCTTC38 CTCTCCCTCX: CTCTOCACTC TGAGM6CA6 AG6AGCCGCA 120 

CX3GCQAGGGG OCGCAGACOG TCTGGAAATG C6AATCCTAA AGCGTTTCCT OGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCT6 GTCCTATAOV GGAGOVCTGA ATCAAAAAAA TTGGGGAAAS 300 

AAATATOCAA CATGTAATAG CCCAAAAGAA TCTOCTATCA ATATTGATGA AOATCTTACA 360 

OmCTAAATG TGAATCTTAA 6AAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGOGAAAACA GTGGAAATTA ATCTCACTAA TGACTACC6T 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCftTCTGA TG6ATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGA7G0GGAC G6ATTTTCAA GTTTTGAGGA ASCAQTCAAA 660 

G6AAAAGGGA AGTTAA6ABC TTTATOCATT TTGTTTGAGG TTGGGACA8A AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTOSAA AGT6TTAGTC GTTTTGGGRA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TGCCTGCACA GACACAGTTG ACTGGATrGT TTTTAAAQAT 900 

ACAGTTAGCA TCTCTGAAAO CCAGTTGGCT tfi ' lTniVm AAOTTCTTAC AATGCAACAA 960 

TCT6GTTATG TCATGCTGAT GQACTACTTA CAAAACAATT TTOQAQABCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAG6AAG AGATTCATGA AGCAGTTTGT 1080 

AOTTCAQAAC CAQAAJATGT TCAGGCTGAC CCAGAGAATT ATACCAG(XT TCTTGTTACA 1140 

TGGGAAA6AC CT0GAGTC6T TTATGATACC AT6ATTGAGA AGTTTGCAGT TTTGTACCAO 1200 

CAGTTG6ATQ OAGAGGACCA AACCAA6CAT GAATTTTTGA CAOATGGCTA TCAAOACTTG 1260 

66T6CTATTC TCAATAATTT 6CTACCCAAT ATOUSTTATG TTCTTCAOAT AGtAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTOSACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAQUWVTAAT CAAGGAGGAG 1440 

6AAGA6GGAA AAGACATTGA A6AAGG0GCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAA6GA ACCCCAGATT TCTAOCACAA <3U3^CTACAA TCGCATAGGQ 1560 

AGGAAATACA ATGAAGCCAA GACTAACOGA TCCCCAACAA GAGGAAGTGA A7TCTCTGGA 1620 

AAGG6TGATO TTCCCAATAC ATCTTTAAAT TOCaCTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAA6 ATATTTCCTT GACTTCTCAG ACTGTGACTG AACP6CCACC TCACACTGTG 1740 

GAAGGTACTT CA6CCTCTTT AAATGATGGC TCTAAAAC7G TTCTTAGATC TCCACAXATG 1800 

AACTTGTOSG GGACT6CAGA ATCCTTAAAT AC3VGTTTCTA TAACAGAATA TGAGGAG6AB 1860 

AGTTTATTGA CCA6TTTCAA 6CTTGATACT GGAGCTGAA6 ATTCTTCAQO CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAOGOTATAT ATTTTCCTCC 1980 

GAAAACCCAO AGACAAIAAC ATATGAT6TC CTTATACCAG AATCTGCTAG AAATGCTTOC 2040 

6AA6A3TCAA CTTCSaCAQG TTCAGAAGAA TCACtAAAGG ATOCTTCTAT GGAGG6AAAT 2100 
GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

A6CTTTCT0C AGACTAATTA CACTGAQATA OST G TTGATQ AATCTQAGAA 6ACAACCAA6 2220 

TCCTTTTCTG CAGG0CCA6T GATGTCACAO GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATOC 2340 

TCXIAGACAAC AG6ATTTGGT CTCCAOGGTC AAOGTGGTAT ACTCGCAGAC AACCCAACOG 2400 

OTATACAATG CAGAG6CCAS TAATAGTA6C CAIGAGTCTC QT A TTGOTCT AGCTGAGGGG 2460 

TIQGAATC08 A6AAGAAGGC AGTTATACCC CTTGTGATGG TGTCAGCCCT GACTTTTATC 2520 

TGTCTAGTGG TTCTTGTGGG TATTCTCATC TACTGGAGCSR AATGCTTCX3V GACTGCACAC 2580 

TTTTACTTAG AGGACAGTAC ATCCCCTAGA GTTATATCCA CACCTCCAAC ACCTATCTTT 2640 

CCAATTTCAG ATGAT6T0G6 AGCAATTCCA ATAAAGCACT TTCCAAA6CA TOTTOCAGAT 2700 

TTACATGCAA GTAGTQ6GTT TACTGAAGAA TTTGACSACAC TGAAAGAGTT TTACCAQGAA 2760 

GT6CA6AGCT GTACTGTT6A CTTAGGTATT ACA6CAGACA 6CTCCAAGCA CCCA6ACAAC 2820 

AAGCAC3^AGA ATOSATACAT AAATAT03TT GCCTATQATC ATAGCAGGGT TAAGCTAGCA 2880 

CAGCTTGCTO AAAAGGATGO CAAACTGACT GATTATATCA ATGCCAATTA TGTTGATGGC 2940 

TACAACAGAC CAAAAGCTTA TATTGCT6CC CAAGGCCCAC TGAAATCCAC AGCTGAAGAT 3000 

TTCT6GAGAA TGATAtGGGA ACATAATGT6 GAAGTTATTG TCAT6ATAAC AAAGCT03T0 3060 

GAGAAA6GAA GQAGAAAAT6 TGATCAGTAC TGGCCTGCGG AT6GGA0TQA G6AGTAC66G 3120 

AACTTTCTGG TCACTCAGAA GAGTGTGCAA GTGCTTGCCT ATTATACTGT GAGGAATTTT 3180 

ACTCTAAGAA ACACAAAAAT AAAAAAOGOC TCCCAGAAAG GAAGACCCAG TGGACGTGTG 3240 

GTCACACAGT ATCACTACAC GCAGTGGCCT GACATGGGAG TACCAGAGTA CTCCC7GCCA 3300 

GTGCTGACCT TT6TGAGAAA G6CAG0CTAT 6CCAAG0G0C ATGCAGTGG6 GCCTGTTGTC 3360 

GTCCACTGCA GTGCTGGAGT TGGAAGAACA GGCACATATA TT6TGCTAGA CAGTATGTTG 3420 

CAGCAGATTC AACAOSAAGG AACTGTCAAC ATATTTGGCT TCTTAAAACA CATCOGTTCA 3480 

CAAAGAAATT ATTTGGTACA AACXGAGGAG CAATATGTCT TCATTCATGA TACACTGGTT 3540 

GAGGCCATAC TTAGTAAAGA AACTGAGGTG CTGGACAGTC ATATTCAT6C CIATGTTAAT 3600 

GCACTCCTCA TTCCTGGACC AGCAGGCAAA ACAAAGCTAG A6AAACAATT CX^GCTOCTO 3660 

AGCCAGTCAA ATATACAGCA GAGTGACTAT TCTGCAGCCC TAAAGCAATO CAACAGGGAA 3720 

AAGAATCGAA CTTCTTCTAT CATCCCTGTG GAAAGATCAA GGGTT6GCAT TTCATCCCTG 3780 

AGTGGAGAAG GCACAGACTA CATCAATGCC TCCTATATCA TGGGCTATTA CCAGA6CAAT 3840 

GAATTCATCA TTACXXAGCA CCCTCTCCTT CATACXATCA AGGATTTCTG GAGGATGATA 3900 

TGGGACCATA ATGCCCAACT GGTGGTTATG ATTCCTGATG GCCAAAACAT GGCAGAA6AT 3960 

GAATTTGTTT ACT6GCCAAA 7AAAGATGAG CCTATAAATT GTGAGA6CTT TAAGGTCACT 4020 

CTTAT08CT8 AAOAACACAA ATGTCTATCT AATG AGGAAA AACTTATAAT TCAGGACTTT 4080 

ATCTTA6AA6 CTACACAGGA T6ATTATGTA CTTGAAGT6A OSCACTTTGA GTGTCCTAAA 4140 

TGGCCAAATC CAGATAGCCX: CATTAGTAAA ACTTTTGAAC TTATAACnXTT TATAAAAGAA 4200 

GAAGCTGCCA ATAG6GATGG GCCTATGATT GTTCATGATG AGCATGGAGG AGTGACGGCA 4260 

GGAACTTTCT GT6CTCTGAC AA0CX:TTATG CAOCAACTAG AAAAAGAAAA TTCCGTGGAT 4320 

GTTTACCAGQ TAGCCAAGAT GATCAATCTG ATGAGGOCAG 6AGTCTTTGC TGACATTSAG 4380 

CAGTATCAGT TTCTCTACAA AQTQATCCTC AGCCrrGTGA GCACAAGGCA GOAAGA^aUlT 4440 

OCATCCJiCCr CTCTGGACAG TAATGGTGCA GCATTGCCTG ATGGAAATAT AGCT6AGAGC 4500 

TTAGAGTCTT TAGTTTAACA CAGAAAGGGO TGGGGGGACT CACATCTGAG CATTGTTTTC 4560 

CTCTTCXTTAA AATTAGGCAO GAAAATCAGT CTAGTTCTGT TATCTGTTGA TTTCCCATCA 4620 

CCT GACAGTA ACTTTCATGA C ATAGG ATTC TGCOSOCAAA TTTATATCAT TAACA ATOTO 4680 

TGCCTTTTTG CAAGACTTGT AATTTACTTA TTATGTTT6A ACTAAAATQA TT6AATTTTA 4740 

CAGTATTTCT AAGAATGGAA TTGTGGTATT TTTTTCTGTA TT6ATTTTAA CAGAAAATTT 4800 

CAATTTATAG AGGTTAGGAA TTCCAAACTA CAGAAAATGT TTGTTTTTAG TGTCAAATTT 4860 

TTAGCT6TAT TTGTAGCAAT TATCAGGTTT GCTAGAAATA TAACTTTTAA TACAGTAGCC 4920 

TGTAAATAAA ACACTCTTOC ATATGATATT CAACATTTTA CAACIGCAGT ATTGACCTAA 4980 

AGTAGAAATA ATCFGTTACT TATTGTAAAT ACTQCCCTAO TGTCTOCATQ QACCAAATTT 5040 

ATATTTATAA TTGTAGATTT TTATATTTTA CTACTGAGTC AAGTTTTCTA GTTCTGTGTA 5100 

ATTGTrrAGT TTAATGACGT AGTTCATTAG CTGGTCTTAC TCTACCAGTT TTCTQACATT 5160 

GTATTGTGTT ACCTAAGTCA TTAACTTTGT TTCAGCATGT AATTTTAACT TTTGTGGAAA 5220 

ATAGAAATAC CTTCATTTTO AAAGAASTTT TTATGAGAAT AACACCTTAC CAAACATTGT 5280 

TCAAATQGTT TTTATCCAAG QAATTGCAAA AATAAATATA AATATTGGCA TTAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAAAAAA 
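
The "Coding sequence" annotation given with each DNA sequence is a pair of 1-based, inclusive nucleotide coordinates. As a quick consistency check, the range can be converted into the expected number of encoded residues. The following is a minimal Python sketch, not part of the original listing: the function name `cds_protein_length` is illustrative, and it assumes that the annotated range includes the terminal stop codon.

```python
def cds_protein_length(start: int, end: int) -> int:
    """Return the number of amino acids encoded by a coding sequence.

    `start` and `end` are 1-based, inclusive nucleotide coordinates.
    Assumption (not stated in the listing): the range includes the stop codon.
    """
    n_nucleotides = end - start + 1
    if n_nucleotides <= 0 or n_nucleotides % 3 != 0:
        raise ValueError("coding range must be a positive multiple of three")
    # One codon per residue, minus the terminal stop codon.
    return n_nucleotides // 3 - 1


if __name__ == "__main__":
    # Range annotated for Seq ID NO: 574.
    print(cds_protein_length(148, 4518))
```

For the range 148-4518 this gives 1456 residues, that is, 24 full rows of 60 residues plus a final row of 16.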



Seq ID NO: 575 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

HRXLKSFLAC IQLLCVCSLD KAHGYYRQQR KXiVEEXGHSY TGAUiQXNHG KEYPTCNSFS 60 

QSPIKIDEDL TQVNVHIiKKL KFQGMDXTSL EKTFZHNT6R TVEHiLTHDY BVSOGVSEHV 120 

PKASKZTFHH GKCNMSSDGS ESSLEX3QKFP LEMQIYCFDA DRFSSFEEAV KGK6KLRALS 180 

XLPEVGTEEN LDFKAZIDGV BSVSRFGKQA ALDPFILLNL LPMSTDKYYI YN6SLTSPPC 240 

TDTVDHIVFK DTVSISESQL AVFCBVLTT4Q QSOYVKLMDY LQNHFREQQY XPSRQVFSSY 300 

TGKEEZHEAV CSSEPQWOA DPEVYTSIiLV TN&RPRVVlfD TMIEECFAVLT QQLDeeDQTK 360 

BEPLTDGYQD LGAIUOfLLP NMSYVLQIVA laVGLYCKY SDQI.IVDMFT DNPELDLPPS 420 

LIGTEEIZKE EBE6KDZBE6 AIVNPGItDSA TNQZRKKEPQ ISTTTHYMRI GTKYHBAKTH 480 

RSPTRGSEPS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASUJD 540 

GSKTVLRSPH MI7LSGTAE6L NTVSZTEYEB BSLItTSPKLD TGAEDSSOSS PATSAZPFIS 600 

ESZSQGYZPS SBNPETITYD VLIPBSARNA SED8TSSGSB BSLKDPSMB6 NVWFPSSIDI 660 

TAQPDVGSGR ESFLQTNYTS IRVDBSEKTT KSFSA6PVMS QGPSVTDLOf PHYSTFAYPP 720 

TEVTPHAFTP SSRQQDLVST VNWYSQTTQ PVYNAEASKS SHESRZGLAE GLESEKKAVZ 780 

PLVIVSALTP ZCLWLVGZL ZYWRKCPQTA HFYI^DSTSP RVZSTPPTPI PPISDDVGAI 840 

PZREFPKBVA DLHASSGFTE EPBTUCEFyQ EVQSCTVDLG ZTADSSKHPD 2IKEKI3RYIKZ 900 

VAVDKSRVRL AQLAEKDGKL TDYZKANYVD 0YNRP2AYIA AQGPLKSTAB DFWRMIWEBH 960 

VEVrVMiniL VEKGRRKCDQ YWPAD6SSEY QIFLVTQKSV QVLAYYTVBN FTLRNTKIRK 1020 

GSQKGRPSta WTQYHYTQW PDMGVPEYSL PVLTFVRKAA YAKRHAVGPV WHCSAGVGR 1080 

TGTYZVLDSM LQQZQHEGTV KIFGPLKHZR SQHUYLVOTB BQYVFZHDTL VEAZLSKETE 1140 

VLDSHZHAYV KALLZPGPAG KTKLEKQPQL hSQSSlQQSD YSAALKQCNH BKZIRTSSZIP 1200 

VERSRVGZ8S LSGBGTDYIS ASYZKGYYQS KEFXZTQEPL laTXKDFHBM ZHDBRAQLW 1260 

NZPDGQNMAB DEFVYHPHXD EPIHCESFKV TLMAESBXCL SNEBKLZI^ FILEATQDDY 1320 

VLEVHHFQCP KHFHPDSPIS KTFELISVIK EBAANSOGFN ZVHOEBOGVT AGTFCALTTL 1380 

KHQIiEKEKSV DVYQVAXMIN ZiKRPGVFADZ BQYQFLYRVZ LSLVSTRQEB NPSTSLOSHG 1440 
AALPDGNIAE SLESLV 
Seq ID NO: 576 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 148-4494 

1 11 21 31 41 51 

| | | | | | 

CACACATACG C31CGCA0GAT CTCACTTOGA TCTATACACT CGAGQATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTOS CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

0G60GAGGGG C06CAGA006 TCTGGAAATQ 06AATCCTAA AGCGTTTCCT 06CTTGCATT 180 

CAGCTCCICT GTO I TT OO OG CCTGQATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

C riXrn t S AAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGG6GAAA6 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATQA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGC3GTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGC3AAATTA ATCTCACTAA TGACTACOGT 480 

GTCAGC6GAG GAGTTTCA6A AATGGTGTTT AAA6CAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCA6AG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GA6ATSCAAA TCTACIGCTT TGAT6CAGAC OQATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTITCAGG TTGGGACAGA AGAAAAmG 720 

GATTTCAAAG CX3ATTATTGA TGGAtSTOGAA AGTGTTAGTC OTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTG6ATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GlTrmVAt l t AAGTrCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTOGAGAOCA ACACTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAQQAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTOQAGTOGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CA0TT06ATG GAQAOGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTOAATTA ATTQGAACTG AAGAAATAAT CAAGGAG6AG 1440 

GAAGAGGGAA AAGACATTGA AGAAG6CGCT ATTGTGAATC CT66TAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA AOOOCAGATT TCTACCACAA CACACTACAA TOGCATAGGG 1560 

ACGAAATACA ATGAAGGCAA GACTAACOGA TCOCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATtTCCTT GACTTCTCAQ ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTOGG GQACTGCAGA ATOCTTAAAT ACAGTTTCTA TAACAGAATA T6A6GAG6A6 1860 

AOTTTATTSA CCAGTTTCAA GCTTGATACT OSAGCTtSAAG ATTCITCAGG CTCCAGTOCC 1920 

QCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTG6TTTC CTAGGTCTAC AGACATAACA GCACAGCC06 ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA OQTGTTGATG AATCTGA6AA GACAAOCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CXnTTGCXTTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TOCAGACAAC AGGATTT6GT CTCCA0G6TC AAOGTGGTAT ACTOGCAQAC AACCCAACOG 2400 

GTATACAATG AGGCCAOTAA TAGTAOCCAT GAGTCTOGTA TTGGTCZAGC T6AGGGGTTQ 2460 

QAATC06A6A AQAAGGCAGT TATACCCCTT GIGATOGTGT CAGCCCTGAC TTTTATC7X3T 2520 

CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 2580 

TACTTAGAGG ACAGTACATC COCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAGATG ATGTOGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 2700 

CAT6CAAGTA GTGGQTTTAC TQAAGAATTT GAaOAAGTGC AGAfiCTGTAC TGTTQACTTA 2760 

GGTATTACAG CAGACAGCTC CAACCACOCA 6ACAACAAGC ACAAGAAT06 ATACATAAAT 2820 

ATOGTTOCCT ATGATCATAG CAGGGTTAAG CTAGCACAGC TTGCTOAAAA GGATGGCAAA 2880 

CTGACTGATT ATATCAATGC CAATTATGTT GATGGCTACA ACAGACCAAA AGCTTATATT 2940 

GCTGCCCAAG GOCCACTGAA ATCCACAGCT GAAGATTTCT GGAGAA7GAT ATGGQAACAT 3000 

AAT0TGGAA6 TT A T T OTCAT GATAACAAAC CTOBTQ Q AGA AAGGAAGGAG AAAAT6TGAT 3060 

CAGTACTGGC CT6C0GATG0 GAOTQAGQAG TAOOQGAACT TTCTGQTCAC TCAGAAGAGT 3120 

GTGCAAGTGC TTGCCTATTA TACTGTGAGG AATTTTACTC TAAGAAACAC AAAAATAAAA 3180 

AAGGGCTCCC AGAAAGGAAG ACCGAGTGGA OG7X3TGGTCA CACAGTATCA CTACACGCAG 3240 

TGG O CTGACA TGGGAGTACC AGAGTACTCC CIGCCA6T6C TGACCTTTGT GAGAAAGGCA 3300 

G0CTAT6CCA AGCGCCATGC AGTQGQOGCT GTTGTOQTGC ACTGCA0TGC TGGAOTTGGA 3360 

AGAACAG6CA CATATATTGT GCIAQACAGT ATGTTGCAGC A6ATTCAACA CGAA6GAACT 3420 

GTCAACATAT TTGQCTTCTT AAAACACATC CGTTCACAAA GAAATTATTT GGTACAAACT 3480 

GAOGAGCAAT ATGTCTTCAT TCATGATACA CTGGTT6AGG CCATACTTAQ TAAAQAAACT 3540 

GAGGT6CTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC TCCTGATTCC TGGAOCAGCA 3600 

G6CAAAACAA AGCTAQAQAA ACAATTCCAG CTGCTGAGCC AOTCAAATAT ACAGCA6AGT 3660 

GACTATTCTO CAGCCCTAAA 6CAATOCAAC AGGGAAAAGA ATCQAACTTC TTCTATCATC 3720 

CCTGTGQAAA GATCAAGQGT TOGCATTTCA TCCCT3AGTG GAGAAGGCAC AGACTACATC 3780 

AATGCCTCCT ATATCATGGO CTATTACCAG AGCAATGAAT TCATCATTAC CCAGCACCCT 3840 

CTCCTTCATA OCATCAAGGA TTTCTGGAGG ATGATATGGG ACCATAATGC CCAACTG G TQ 3900 

6TTATGATTC CT6AT66CCA AAACATGGCA GAAGATQAAT TTOmACIO OOCAAATAAA 3960 

GATGA6CCTA TAAATTGTGA GAGCTTTAAG'GTCACTCTTA T66CT6AAGA ACACAAAT6T 4020 

CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT TAGAAGCTAC ACAGOITGAT 4080 

TATGTACTTQ AAOTGAOGCA CTTTCAGTGT CCTMKTGGC CAAATCCAGA TAGCCCCATT 4140 

AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG CTGCCAATAG GGATGGGCCT 4200 

AT6ATTGTTC AT6AT6A6CA TQQAGGAGTQ AOGGCAGGAA CTT1CTGTGC TCTQACAACC 4260 

CTTATGCACC AACTAGAAAA AGAAAATTGC GTGGATGTTT ACXAGGTAGC CAAGATGATC 4320 

AATCTGATGA OGCCAGGAGT CTTT6CTGAC ATTQAOCAGT ATCAGTTTCT CTACAAAGTG 4380 

ATCCTCAGCC TTGTQAGCAC AAGGCAOGAA GAGAATCCAT CCACCTCTCT GGACS^GTAAT 4440 

GGTGCAGCAT TGCCTGATGG AAATATAGCT GA6AGCTTAG AGTCTTTAGT TTAACACAGA 4500 

AAGG6GTGGG GGGACTCACA TCTGAGCATT GTTTTCCTCT TCCTAAAATT AG0CA06AAA 4560 

ATCAGTCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG ACAGTAACTT TCATGACATA 4620 

GGATTCTGCC QCCAAATTTA TATCATTAAC AATGTGTOOC TTTTTGCAAG ACTTGTAATT 4680 

TACTTATTAT GTTTGAACTA AAATGATTGA ATTTTACAGT ATTTCTAAGA ATGGAATTGT 4740 

GGTATTTTTT TCTGTATTGA TTT7AACAGA AAATTTQU^T TTATAGAG6T TAGGAATTOC 4800 
AAACTACAGA AAATOTTTGT TTTOGTGTC AAATTTTTAG CTGTATTTOT AGCAATTATC 4860 
AOOTTTGCTA GAAATATAAC TrTTAATACA GTAOCXTOTA AATAAAACAC TCTTCCATAT 4920 
GATATTCAAC ATTTTACAAC TGCftGTATTC ACCIAAAGTA GAAATAATCT GTTACTTATT 4980 
GTAAATACTG CCCTAGTtSTC TCOVTGGACC AAATTPATAT TTATAATTGT AGATTTTTAT 5040 
ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTGTAATTG TTTAGTTTAA TGACGTAGTT 5100 
CATTAGCfGG TCTTACTCTA OCACSTTTTCT 6ACATTGTAT TCTGTTACCT AAGTCATTAA 5160 
CTTTSTTTCA GCATOTAATT TTAACTTTTG TOGAAAATAO fAPOAXXTIC ATTrCGAAAG 5220 
AAGTTTTTAT GAGAATAACA CCTTAOCAAA CATTOTTCAA ATGGTTTrTA TCCAAflOAAT 5280 
TGCAAAAAXA AATATAAATA TTGCCATTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 5340 
AAA 

Seq ID NO: 577 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 
| | | | | | 
MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60 
QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 
FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 
ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240 
TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 
TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360 
HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 
LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 
RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 
GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 
ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660 
TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 
TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNEASNSS HESRIGLAEG LESEKKAVIP 780 
LVIVSALTFI CLVVLVGILI YWRKCFQTAH FYLEDSTSPR VISTPPTPIF PISDDVGAIP 840 
IKHFPKHVAD LHASSGFTEE FEEVQSCTVD LGITADSSNH PDNKHKNRYI NIVAYDHSRV 900 
KLAQLAEKDG KLTDYINANY VDGYNRPKAY IAAQGPLKST AEDFWRMIWE HNVEVIVMIT 960 
NLVEKGRRKC DQYWPADGSE EYGNFLVTQK SVQVLAYYTV RNFTLRNTKI KKGSQKGRPS 1020 
GRVVTQYHYT QWPDMGVPEY SLPVLTFVRK AAYAKRHAVG PVVVHCSAGV GRTGTYIVLD 1080 
SMLQQIQHEG TVNIFGFLKH IRSQRNYLVQ TEEQYVFIHD TLVEAILSKE TEVLDSHIHA 1140 
YVNALLIPGP AGKTKLEKQF QLLSQSNIQQ SDYSAALKQC NREKNRTSSI IPVERSRVGI 1200 
SSLSGEGTDY INASYIMGYY QSNEFIITQH PLLHTIKDFW RMIWDHNAQL VVMIPDGQNM 1260 
AEDEFVYWPN KDEPINCESF KVTLMAEEHK CLSNEEKLII QDFILEATQD DYVLEVRHFQ 1320 
CPKWPNPDSP ISKTFELISV IKEEAANRDG PMIVHDEHGG VTAGTFCALT TLMHQLEKEN 1380 
SVDVYQVAKM INLMRPGVFA DIEQYQFLYK VILSLVSTRQ EENPSTSLDS NGAALPDGNI 1440 
AESLESLV 

Seq ID NO: 578 DNA sequence 
Nucleic Acid Accession #: Eos sequence 
Coding sequence: 501-4514 

1 11 21 31 41 51 
| | | | | | 
CACACATACG CAOGCAOGAT CTCACTTOGA TCTATACACT GGAGGATTAA 60 
CAAAAAAAAC ATTTCCTTCG CTCXX:CCTCC CTCTCCACTC TGAGAAGCAO 120 
0Q60GAGGGG CCX3CAGACC6 TCTGGAAATQ CX3AATCCTAA AGCGTTTCCT 180 
CAGCTCCTCT GTGTTT6C06 CCTQGATTGQ OCTAATGGAT ACTACAGACA 240 
CTTGTT6AA6 AGATTGQCTG GTOCIATACA GGAGCACTGA ATCAAAAAAT 300 
AATATCCAAC ATGTAATAGC OCAAAACAAT CTCCTATCAA TATTGATGAA 360 
AAGTAAATGT GAATCTTAAG AAACTTAAAT TTCAGG6TTG QGATAAAACA 420 
ACACATTCAT TCATAACACT GGGAAAACAS TGGAAATTAA TCTCACTAAT 480 
TCA00G6AG6 AGTTTCA6AA AT6GTGTTTA AAGCAAGCAA GATAACTTTT 540 
AAT6CAATAT GTCATCTGAT GGATCAGAGC ATAGTTTAGA AGGACAAAAA 600 
AGATGCAAAT CTACTGCTTT GATGCGGACC GATTTTCAAG TTTTGAGGAA 660 
GAAAAGGGAA GTTAAGAGCT TTATCCATTT TGTTTGAGGT TGGGACAGAA 720 
ATTTCAAAGC GATTATTGAT 6GAGTCX3AAA GTGTTAGTOQ TTTTGGGAAQ 780 
TAGATOCATT CATACTGTTG AACCTTCT6C CAAACTCAAC TGACAAGTAT 840 
ATGGCTCATT GACATCTCCT CCCTGCACA6 ACACAGTTGA CTGGATTGTT 900 
CRGTTAGCAT CTCTGAAAGC CAGTTGGCTS TTTTTTGTGA AGTTCTTACA 960 
CTGGTTATGT CATGCTGATO GACTACTTAC AAAACAATTT TOGAGAGCAA 1020 
TCTCTAGACA GOTGTTTTCC TCATACACTG GAAAGGAAGA GATTCAT6AA 1080 
GTTCAGAACC AGAAAATGTT CAQOCTGACC CAGAGAATTA TACCAGCCTT 1140 
GGGAAAGAOC T0GAGTC6TT TATGATACCA TGATTGAGAA GTTT6CAGTT 1200 
AGTTGGATGG AGAGGACCAA ACCAAGCATQ AATTTTTGAC AGATG6CTAT 1260 
GTGCTATTCT CAATAATTTG CTACCCAATA TQAGTTATGT TCTTCAGATA 1320 
QCACTAATGG CTTATATGGA AAATACAG03 AGCAACTGAT T6T0GACAT6 1380 
ATCCTGAACT TGATCITTTC GCTGAATTAA TTGGAACTGA AQAAATAATC 1440 
AAGAGG6AAA AGACATTGAA GAA6G0GCTA TTGTGAATOC TGGIAGAGAC 1500 
ACCAAATCA6 GAAAAAGGAA CCCCAGATTT CTACC31C3^C ACACTACAAT 1560 
OGAAATACAA TGAAGCCAAG ACTAAOCGAT CCCCAACAAG AGGAA6TGAA 1620 
A6Q6TGATGT TCCCAATACA TCTTTAAATT GCACTTCCCA AOCAGTCACT 1680 
CAOAAAAAGA TATTTCCTTO ACTTCTCAGA CTGTGACI6A ACIGOCAOCT 1740 
AAGGTACntr AGCCTCTTTA AATGAT66CT CTAAAACTGT TCTTAGATCT 1800 
ACTTGTCQGG GACTGCAGAA TCCTTAAATA CAGTTTCTAT AACAGAATAT 1860 
GTTTATTGAC CAGTTTCAAG CTTGATACTG GAGCTGAAGA TTCTTCAGGC 1920 
CAACTTCTGC TATCCCATTC ATCTCPGAGA ACATATCCCA A6GGTATATA 1980 
AAAAOOCAOA GAGAATAACA TATGATOTOC TTATACCAGA ATCTGCTAGA AATGCTTCOG 2040 
AAQATTCAAC TTCATCAOGT TCA6AAGAAT CACTAAAGGA TCCTTCTATQ GAOGGAAATG 2100 
TGTGGTTTCC TAGCTCTACA GACATAACAG CACAGOCOGA TGTTGGATCA 6GCAGAGAGA 2160 
GCTTTCTCCA 6ACTAATTAC ACTGAGATAC GTGTTGATGA ATCTGAGAA6 ACAACCAAGT 2220 
CCTTTTCTGC AGGCCCAGTG ATGTCACAGG GTCCCTCAGT TACAGATCTG GAAATGCCAC 2280 



ATTATTCTAC CTTTQCCTAC TTGCCAACTG AGQTAACAOC TCATGCTTTT ACGOCATCCr 2340 
OCASAOUVCA GQATTTGGTC TCCA06GTCA AOGTt^TATA CT06CAGACA ACCCAACOGG 2400 
TATACAATGA 60CCAGTAAT AGTAGCCATG AGTCT06TAT TGGTCTA6CT GAGGGGTT6G 2460 
AATCOGAGAA GAAGGCAGTT ATACXXXH-TG TC3ATCGTGTC AGCCCTGACT TTTATCTGTC 2520 
TAGTGGTTCT TGTG6GTATT CTCATCTACT GGAGGAAATO CTTCCAGACT GCACACTTTT 2580 
ACTTAGAjSGA OVGTACATCC OCTAGAGTTA TATCCACAOC TCCAACACCT ATCTTTCXAA 2640 
TTTCAGATGA TGTCGGA6CA ATTCCAATAA AGCACTTTCC AAAGCATGTT GCAGATTTAC 2700 
ATGCAAGTAG TGGGTTTACT GAAQAATTTG AGACACTGAA AGA6TTTTAC CAGGAAGTGC 2760 
AGAGCTGTAC TGTTGACTTA GGTATTACAG CAGACAGCTC CAACCACCXA GACAACAAGC 2820 
ACAAGAAT08 ATACATAAAT ATC6T7GCCT AT6ATCATAG CAGGGTTAAO CTAGCACA6C 2880 
TTGCTGAAAA GQATGGCAAA CTGACTGATT ATATCAAXGC CAATTATOTT GATGOCTACA 2940 
ACA6ACCAAA AGCTTATATT GCTGCOCAAG GOCCACTGAA ATCCACAGCT GAAGATTTCT 3000 
GGAGAATGAT ATGGGAACAT AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA 3060 
AAGGAAGGAO AAAATGTGAT CAGTACTGGC CTGCCGATGG GAGTGAGSVS TACGGGAACT 3120 
TTCTGGTCAC TCAGAAGAGT GTGCAAC3TGC TTGCCTATTA TACTGTGAGG AATTTTACTC 3180 
TAAGAAACAC AAAAATAAAA AAGG6CTCCC AGAAAGGAAG AGCCAGTGGA OSTGTGOTCA 3240 
CACAGTATCA CTACAGGCAG TGGCCTGACA TGGGAGTAOC AQA6TACT0C CTGCCAGTGC 3300 
TGAGCTTTGT GAGAAAG6CA GCCTAT6CCA AG06CCATGC AGTGGGGCCT GTTGTCGTOC 3360 
ACroCAGTGC TGGAGTTGGA AGAACAGGCA CATATATT6T GCTAGACAGT ATGTTGCaVGC 3420 
AGATTCAACA CGAAGGAACT 6TCAACATAT TTG6CTTCTT AAAACAOVTC OTITCACAAA 3480 
6AAATTATTT GGTACAAACT GAGGASCAAT ATGTCTTOIT TCATGATACA CXGGTTGAGO 3540 
CCATACTTA6 TAAAGAAACT GAGGTGCT6G ACA6TCATAT TCATGCCTAT GTTAATGC3VC 3600 
TCCTCATTCC TGGACCAGCA GGCAAAACAA AGCTAGA6AA ACAATTCCAG CTCCTGAGCC 3660 
AGTCAAATAT ACAGCAGA6T GACTATTCTG CAGCCCTAAA GCAATGCAAC AGGGAAAAGA 3720 
ATOGAACTTC TTCTATCATC CCTGTGGAAA GATCAAGGGT TGGCATTTCA TCCCTGAGTG 3780 
GAGAAGGCAC AGACTACATC AATGCCTCCT ATATCATGGG CTATTACCAG AGCAATGAAT 3840 
TCATCATTAC CCAGCACCCT CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGG 3900 
ACCATAATGC CCAACTGGTG GTTATGATTC CTGATGGCCA AAACATGGCA GAA6ATGAAT 3960 
rrGTTTACTG GCX3U\ATAAA GATGAGCCTA TAAATTGTGA GAGCTTTAAG GTCACTCTTA 4020 
TGGCTGAAGA ACACAAATGT CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT 4080 
TAGAAOCTAC ACAGGATGAT TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC 4140 
CAAATCCA6A TAGCCCCATT A6TAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG 4200 
CTGOCAATAG GGATGGGOCT ATGATTGTTC ATGKTGA6CA TGGAGGAGTG ACGGCAGGAA 4260 
CTTTCTGTGC TCTGACAAOC CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT 4320 
ACCA66TAGC CAAGATGATC AATCTGATGA GGCCA6QAGT CTTTGCTGAC ATTGAGCAGT 4380 
ATCAGTTTCT CTACAAAGTG ATCCTCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATCCAT 4440 
CCACCTCTCT GGACAGTAAT G6T6CAGCAT TGCCTGATGG AAATATAGCr GAGAGCTTAG 4500 
AQTCTTTAOT TTAACACAOA AAG0Q6TGQ0 GGGACTCACA TCTGAGCATr GTTTTCCTCT 4560 
TCCTAAAATT AGGCAGGAAA ATCAGTCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG 4620 
ACAGTAACTT TCATQACATA GGATTCTGCC GCCAAATTTA TATCATTAAC AATGTGTGCC 4680 
TTTTTGCAAG WiTlTCTAATT TACTTATTAT GTTTQAACTA AAATGATTGA ATTTTACAGT 4740 
ATTTCTAAGA ATGGAATTGT GGTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT 4800 
TTATASAGGT TAGGAATTCC AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAO 4860 
CTGTATTTGT AGCAATTATC AGGTrfGCTA GAAATATAAC TTTTAATACA GTAGGCTOTA 4920 
AATAAAACAC TCTTCCATAT GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAA6TA 4980 
GAAATAATCT GTTACrrATT OTAAATACTG CCCTAGTCTC TCCATGGACC AAATTTATAT 5040 
TTATAATTGT AGATTTTTAT ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTGTAATTG 5100 
TTTAOTTTAA TGAOOTAGTT CATTAGCTQO TCTTACVCTA CCAGTTTTCT GACATTGTAT 5160 
TGT6TTACCT AAGTCATTAA CTTTOTTTCA GCATGTAATT TTAACTTTTO TOGAAAATAO 5220 
AAATACCTTC ATTTTGAAAG AAGTTTTTAT GAGAATAACA CCTTACCAAA CATTGTTCAA 5280 
ATGGTTTTTA TCCAAGGAAT TGCAAAAATA AATATAAATA TtOCCArTAA AAAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAA 



Seq ID NO: 579 Protein sequence:
Protein Accession #: EOS sequence



1 
I 

MVPKASXITF 
LSILFBVGTE 
PCTDTVDWrv 
SYTGKEBIHB 
TKBEFLTDQY 
PELZGTBBZZ 
TNR5PTRGSB 
NDGSKTVLRS 
ISENISQGYI 
DZTAQPOVOS 
FPTBVTFBAF 
IPLVIVSALT 
IPIKHPPKHV 
IVATOHSRVK 
MVEVXVMIXN 
XGSQK6RPSG 
RTGTYIVLDS 
EVLDSaiHAY 
PVERSRV6ZS 
VMIPDG^MA 
yVLEVRHFQC 



11 

1 

RWGKOmSSD 
ENLDPKAIID 
FKDTVSISES 
AVCSSEPENV 
QDLGAIUIML 



21 



GAAZJDGNZA 



PSGKGDVPNT 
PHMNLSGTAE 
FSSEHPETIT 
GRESFLQTNY 
TPSSRQQDLV 
FICLWLVGI 
ADLHASSGFT 
LAQLAEKDGK 
LVEKGRRKCD 

RwroyHYTQ 

KliQOIQHEGT 
VMALLIPGPA 
SLSGBGTDYZ 
EDSFVYHFlOf 
PKHPNPPSFZ 
VDVYQVAXMI 
ESXjESIiV 



GVESVS31FGK 
QLAVPCBVLT 
QADPENYTSL 
LPNMSYVLQI 
EGAZVNFGRD 
SLNSTSQPVT 
StNTVSITBY 
YDVLIPESAR 
TBIRVDBSEK 
STVNWYSQT 
LIYWRKCFQT 
EEPETLKEFY 
LTDYXHAKYV 
QYWPAZXSSEB 
WPDMavPEYS 
VNIFGFLKRI 
GKTIOEKQPQ 
HASyiHGYYQ 
DEPIHCBSFK 
SKTPSXiZSVZ 
NLMRPGVFAD 



31 
I 

FPLEKQZYCP 

OAALDPFILL 
MQQSGyVMU4 
LVTHERFRW 
VAXCIMGLYO 
SATNQIRKKE 
KLATEXDZSL 
EEBSLLTSFK 



41 



TTKSFSAGFV 
TQPVYNEASZI 
AEFYLEDST8 
QEVQ8CTVDL 
DGXNRPKAYI 
Y(9IFLVTQKS 
LPVLTFVRKA 
RSQRNYLVQT 
LLSQSNIQQS 
SNEFIZTQHP 
VTLMABEHKC 
KEEAANRD6P 
lEQYQFLYKV 



miliPNSTDKY 
DYLQIJNFRBQ 
YDTNIEKFAV 
KYS DQLIVD M 
PQZSTTTHYK 
TSQTVTELPP 
LDTGAEDSSG 
SEBSliKDPSM 
NSQQPSVTDL 
SSHBSRIGLA 
PRVISTPPTP 
GITADSSNHP 
AAQGPIiKSTA 
VQfVXAYYTVR 
AYAKRBAVGP 
EEQYVFIHDT 
DYSAALKQCM 
LIiBTZKDFHR 
LSEISBKZilZQ 
MIVBD&H8GV 
ILSLVSTRQB 



51 
I 

AVKSXGKLRA 
YIYNGSLTSP 
QYKPSRQVPS 
LYQQLDGEDQ 
PTDNFEU)LF 
RZGTKYNEAK 
HTVEGTSASL 
SSPATSAIPP 
B6NVWFPS8T 
ENraYSTFAY 



IFFISDDVGA 
X3NKHXNRYIN 
BDFNRMIflEH 
HFTLRIITKIK 
WVHCSAGVO 
IiVEAIItSKET 
REKKRTSSZI 
MIKDHKAQLV 
OPZUBATQDD 
XAGTFCAIiIT 
BfPSTSU>SN 



2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 



Seq ID NO: 580 DNA sequence

Nucleic Acid Accession #: EOS sequence

Coding sequence: 148..4632



1 11 21 31 41 51







| | | | | |

O^CACATAiOG CAC6CAC6AT CTCACTT06A TCTATACACT GGAGGAT7AA AACAAACAAA 60 

C3UUVAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCJWrrC TGAGAAGCAG AGGAGCOra 120 

OGGCXSAGGGG CCGCAGACCG TCTGGAAATG C3GAATCCTAA AAOGTTTCCT CGCTTGCATT 180

CAGCTCCTCT GT C 'l'lTGCXX S CCTQGATTGG GCTAATGGAT ACTACASACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA QGA6CACT8A ATCAAAAAAA TTGGGGAAAO 300 

AAATATCCAA CATGTAATAO CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGtAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACOST 480 

GTCAGCGGAG GAGTTTCAGA AATG6TGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAAZGCAATA TGTCATCIGA TGGATCAGAB GATAOTTTAG AAG6ACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TQAT80GGAC OQATTTTOA GTTTT(SU3GA ABCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGA<3^GA AGAAAATTTG 720 

GATTTCAAAG CQATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACXTTTCTG CCAAACTCAA CTGACAAGTA TTACaTTTAC 840

AATG6CTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTQGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTQAAAQ CCAGTTGGCT G TT mTU ' l ' G AAOTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAfiAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGT7CACSAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACX»GCX:T TCTTGTTACA 1140 

TGGGAAAGAC CTGGA6TCGT TTATQATACC ATGATTGAGA AGTTT6CAGT TTT6ZACCA0 1200 

CAGTTGGATG GACAG6ACCA AACCAAGCAT GAATTTTTGA CAGATG6CTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCXTTGAATTA ATTG6AACTG AAGAAATAAT CAA6GAGGAG 1440 

GAAGAGGGAA AAGACATT6A AGAA06CGCT ATTGTGAATC CTGGTAGAOA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TOGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

QAAGGTACTT CAGCXTTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAG6A0 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTA-TCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA OGTGTTGATG AATCTGAGAA GACAACX^UVG 2220 

TCCTTTTCTG CAGGCCCAGT GATXSTCACAG GGTCCCTCAG TTAC3M3ATCT 6GAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT ThCCCCIiTCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGQGGTTG 2460 

GAATCCGAGA AGAAOGCAGT TATACCCCTT GTGATCGTGT CAOCCCTGAC TTTTATCTGT 2520 

CTAGTGGTTC TTGT G GGTAT TCTCATCTAC TG6AGGAAAT GCTTOCAGAC TGCACACTTT 2580 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTOCAACACC TATCTTTCCA 2640 

ATTTCAGATG ATGTOSGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAQATTTA 2700 

CATGCAAOTA GTO8GTTTAC TCAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 2760 

CA6A0CT6TA CTGTTGACTT AGGTATTACA 6CAGACA6CT CCAACXrACCC AGACAACAAG 2820 

CACAAGAATC 6AXACATAAA TATOGTTGGC TAT6ATCATA 6CA0GGTTAA GCTAGCACAG 2880 

CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 2940 

AACAGACCAA AAGCTTATAT TOCTGCCCaA GGCCCACTGA AATCCACAGC TGAAGATTTC 3000 

TGGAGAATGA TATGGGAACA TAAT6TGGAA GTTATTGTCA TGATAACAAA CCTOGTGGAG 3060 

AAAGQAAOGA GAAAATGTGA TCAGTACTGO CCIGCOGATG GGAQTGAGGA OTAOGGGAAC 3120 

TTTCTGGTOl CICAGAASAG TGTCCAASTG CTTGCCTATT ATACTGXXSAG GAATTTTACT 3180 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGTGG A0GTGT6GTC 3240 

ACACAGTATC ACTACACGCA GTGGCCTGAC AT6GGAGTAC C3W5AGTACTC CXTTGCCAGTC 3300 

CTGACCTTTG TGAGAAAGGC AGOCTATGCC AAGOGCCATG CAGTGGGGCC TGTTGTCGTC 3360 

CACIGCAOTG CTGQAGTTGO AAGAACAGGC ACATATATT6 TGCTAGACAG TATGTTQCAG 3420 

CAGATTCAAC AC6AAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CXXTTTCACAA 3480

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTOGTTGAG 3540 

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 3600 

CrCCTCATTC CTGGACCA6C AGGCAAAACA AAOCTAGAGA AACAATTCCA GGGTCTCACT 3660 

CTGTCACOCA GGCTGGAGTG CAGAGGCACA ATCTGGGCTC ACTGCAACCT TGCTCTCCXrr 3720 

GGCTTAACT6 ATCCTGCTAC CTCAGCCTC!C CGA6TGGCTG GGACTATACT CCTGA6CCAG 3780 

TCAAATATAC AGCAGAGTGA CTATTCTGCA GCXXrTAAAGC AATGCAACAG GGAAAAGAAT 3840 

CGAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGGTTO GCATTTCATC CCT6A6TGGA 3900 

GAAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGGCT ATTACCAGA6 CAATGAATTC 3960 

ATCATTACCC AGCACOCTCT CCTTCATACC ATCAA6GAT7 TCTGGAGGAT GATATGGGAC 4020 

CATAATGCCC AACTOGTGGT TATGATTCCT GATOGCXAAA ACATOGCAGA AGATGAATTT 4080 

6TTTACTGGC CAAATAAAGA T6AGCCTATA AATTGTGAGA GCTTTAAGGT CACTCTTATO 4140 

GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 4200 

GAAGCTACAC AGGATGATTA TGTACTT6AA GTGAGGCACT TTCAGTGTCC TAAATGGCCA 4260 

AATOCAGATA GOCXCATTAG TAAAACTTTT GAACTIAZAA GTOTTATAAA AGAAGAA6CT 4320 

GCCAATAGQG ATGGGCCTAT GATTGTTCAT GATGA6CAT6 6AGGA6TGAC GGCAGGAACT 4380 

TTCTGTGCTC TGACAACCCT TATGCACCAA CTAGAAAAAG AAAATTCOGT QGATGTTTAC 4440 

CAGGTAGCCA AGATGATCAA TCTGATGAGG CX3VGGAGTCT TTGCTGACAT TGAGCAGTAT 4500 

CAGTTTCTCT ACAAAGTGAT OCTCAGCCTT GTGGGCACAA QGCAGGAAGA GAATCCATCC 4560 

ACCrCTCTGO ACAGTAATGQ TQCAGCATTO OCTOATGGAA ATATA6CT6A GA6CTTA6AG 4620 

TCTTTAGTTT AACACAGAAA 0QGGTGGGG6 6ACTCACATC T6AGCATT6T TTTCCTCTTC 4680 

CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CPGTTATCTO TTGATTTCCC ATCACCTGAC 4740 

AGTAACTTTC ATGACATAGG ATTCTGCOGC CAAATTTATA TCATTAACAA TGTGTGCCTT 4800 

TTTGCAA6AC TTGTAATTXA CTTATTATGT TTGAACIAAA AT6ATT6AAT TTTACAGTAT 4860 

TTCIAAGAAT GGAATTOTGO TATTTTTTTC T8TATTGATT TXAACA6AAA ATTTCAATTT 4920 

ATAO^TTA GQAATTCCAA ACTACAOAAA ATGTrTGTTT TTAGTGTCAA ATTTTTAGCT 4980 

GTATTTGTAG CAATTATCAG GTTTGCTAGA AATATAACTT TTAATACAGT AGCCTGTAAA 5040 

TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 5100 

AATAATCTGT TACTTATTCT AAATACTGCC CTAGTGTCTC CATOQACX3UI ATTTATATTT 5160 
ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAAGTTT TCTAGTTCTG TGTAATTGTT
TAGTTTAATG A0GTW3TTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATT6 
TOTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAA TAGRA 
ATACCTTCAT TTTGAAAGAA GTTTTTATGA GAATAACACC TTACCAAACA TTGTTCAAAT 
GGTTTTTATC CAAGGAATTQ CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 
AAAAAAIVAAA AAAAAAAAAA A 
Seq ID NO: 581 Protein sequence:
Protein Accession #: EOS sequence



1 
I 

MRILKRFLAC 
QSPZMZDEDL 
PKASKITFHW 
ILFEVGTEEU 
TDTVDWrVPK 
TGKEBZBEAV 
HEFLTDGYOD 
LI6TEBIIRB 
RSPTRGSEP8 
GSKTVhRSPH 
EtnSQGyiPS 
TAQP0V6S6R 
TEVTPHAPTP 
LVrVSALTPI 
IKHFPKHVAO 
AYDHSRVKIA 
BVIVMItWLV 
SQK6RPS6RV 
GTYIVUJSML 
LDSHIBAYVN 
SRVAGTILLS 
YIM6YYQSME 
IKCESPKVTL 
FELISVIKEE 
RPGVFADIBQ 



I 

TQVRVNLKKL 
GKCNM5S0GS 
LDPKAIIDGV 
OTVSISESQL 
CSSEPENVQA 
U3AILNNLLP 



GKGDVPNTSL 
MNLSGTAESL 
SENPETXTYD 
BSFIiQTNyTB 
SSRQQDLVST 
CLWLVGILI 
LHASSGFTEE 
QLAEXDGKLT 



VTQYHYTQWP 
QQIQHEG7VN 
AUilPGFAGK 
QSNIQQSOYS 
FIITQHPLia 
MAESOCCLSN 
AANRDGPMIV 
YQFLYKVIZiS 



21 
I 

NANGYYRQQR 
KFQGKDKTSL 
EESLEGQKFP 
ESVSRFGKQA 
AVPCEVLTMQ 
DPEHYTSXiLV 
NMSyVLQZVA 
AIVNPGRDSA 
NSTSQPVTKL 
NTVSITEYEB 
VLIPES ARHA 
ZRVDESEKTT 
VNWYSQTTQ 
YWRKCPQTAH 
FETItKEFYOB 
DYZNANyVDO 
WPADGSBBYG 
DMGVPEYGLP 
IFGFLKHIRS 
TKLEKQFQ6L 
AALKQC37RSK 
TIKDFWRMZN 
EEKLIIQDFI 
HDSHGGVTAG 
LVGTRQEEKP 



31 
I 

KLVEEIGWSY 
EHTFimiTGK 
LOtQIYCFDA 
AUJPPILI^ 
QSGYVMLMDY 
TWERPRWYD 
ICTNGLY6KY 
TNQIRKKEPQ 
ATEKDISLTS 
ESLLTSFKLD 



KSFSA6PVMS 
PVYNEASNSS 
FYLEDSTSPR 
VQSCTVDU3I 
YMRPKAYIAA 
NPLVTQKSVQ 
VLTFVRKAAY 
QRNYLVQTEE 
TLSPRLECRG 
NRTSSZIPVB 
DHNAQLWMZ 
LEATQDDYVIi 
TPCALTTLMH 
STSLDSNGAA 



Seq ID NO: 582 DNA sequence

Nucleic Acid Accession #: NM_002851.1

Coding sequence: 148..7092



CACACATAOG 
CAAAAAAAAC 
GGG0GAG6GG 
CAGCTCCTCT 
CTTGTTGAAG 
AAATATCCAA 
CAAGTAAATG 
AACACATTCA 
GTCAOCGGAG 
AAATGCAATA 
GAGAT6CAAA 
GGAAAAGGGA 
GATTTCAAAG 
TTAGATCCAT 
AATGGCTCAT 
ACAGTTAGCA 
TCTOGTTATG 
TTCTCTAGAC 
AGTTCA6AAC 
TGGGAAA6AC 
CAGTTG6ATG 
GGTGCTATTC 
TGCACTAATX3 
AATCCT6AAC 
GAAGAGG6AA 
AAGCAAATCA 
AOSAAATACA 
AAGGGTGATG 
ACA6AAAAAG 
GA AGOTACT T 
AACTT0T060 
AGTTTATTGA 
GCAACTTCTG 
GAAAA0CCA6 
GAAGATTCAA 
GTGTGGTTTC 
AGCTTTCTCC 
TCCTTTTCTG 
CATTATTCTA 
TOCAGACAAC 
GTATACAATG 
AOCCnTTGT 



11 

1 

CACGCACX3AT 
ATTTCCTTOG 
OOSCAG A OOB 
GTGTTT6CXX3 
AGATTGGCTG 
CATGTAATAG 
TGAATCTTAA 
TTCATAACAC 
GAGTTTCAGA 
TGTCATCTGA 
TCTACTGCTT 
AGTTAAGAGC 
OGATTATTGA 
TCATACTGTT 
TGACATCTCC 
TCTCTGAAAG 
TCATGCTGAT 
AGGTGTTTTC 
CAGAAAATGT 
CTOGAGTOGT 
GAGAGGACCA 
TCAATAATTT 
GCTTATATGG 
TTGATCTTTT 
AAGACATTGA 
GGAAAAA6GA 
ATGAAGCCAA 
TTCCCAATAC 
ATATTTCCTT 
CAGCCTCTTT 
GGA CTGC AQA 
CCAQTTTCAA 
CTATCCCATT 
A6ACAATAAC 
CTTCATCAOG 
CIAGCTCIAC 
A6ACTAATTA 
CAGGCrCAGT 
OCTTTGCCTA 
AGGATTTGGT 
GTGAGACACC 
TQCTTQACAA 



21 

1 

CTCACTT06A 
CTCCGOCTCC 
TCTQQAAATO 
CCTGQATTGO 
GTCCTATACA 
CCCAAAACAA 
GAAACTTAAA 
TGGQAA AACA 
AATGGTGTTT 
TG6ATCAGAG 
TGATGOGGAC 
TTTATCCATT 
TGGAGTCX2AA 
GAAOCTTCTO 
TCCCTGCACA 
CCAGTTGGCT 
GGACTACTTA 
CTCATACACT 
TCAGGCTGAC 
TTATQATACC 
AACCAAGCAT 
GCTAOOCAAT 
AAAATACAGC 
CCCTGAATTA 
AGAAGGGGCT 
ACC0CA6ATT 
GACTAACOGA 
ATCTTTAAAT 
GACTTCrCAG 
AAATGATGGC 
ATCCTTAAAT 
GCTTGATACT 
CATCTCTGAG 
ATATGATGTC 
TTCAOAAOAA 
AGACATAACA 
CACTGAGATA 
GAT6TCACA6 
CTTCCCAACT 
CTOCACGGTC 
TCTTCAACCr 
TCAGATCCrC 



31 
I 

TCTATACACT 
CTCTCCACTC 
OGAATCCTAA 
GCTAAT6GAT 
GGAGCACTGA 
TCTCCTATCA 
TTTCAGGGTT 
GTQQAAATTA 
AAAGCAAGCA 
CATAGTTTAG 
OGATTTTCAA 
TTGTTTGAGO 
AGTGTTAGTC 
CCAAACTCAA 
GACACAGTTG 
GT ' l T rntjiXj 
CAAAACAATT 
GGAAAGGAAG 
CCAGAGAATT 
ATGATT6AGA 
GAATTTTTGA 
ATGAGTTATG 
GACCAACTGA 
ATTGGAACTG 
ATTOTGAATC 
TCTACCACAA 
TCCCCAACAA 
TCCACTTCCC 
ACTGTGACTG 
TCTAAAACTG 

ACMynrcTA 

GGAGCTGAAG 
AACATATCCC 
CTTATAOCAG 
TCACTAAAGG 
6CACAGCC06 
OGTGTTGATG 
GGTCCCTCAQ 
GA QOTAAC AC 
AA0GTQ8XAT 
TOCTACAGTA 
AACACtACCC 



41 

I 

TGAMiKNWQ 
TVEINLTNDY 
DRPSSFEEAV 
LPKSTDKYYI 
LQNNFRBQQY 
TMXSKFAVLY 
SDQLZVOMPT 
ISTTTHYNRI 
QTVTELPPHT 
TGAEDSSGSS 
BSLXDPSMBG 
QGPSVTDXjEM 
RESSZGLAEG 
VISTPPTPIP 
TADSSNHPDN 
QGPZiKSTAED 
VLAYYTVRKP 
AKKHAVGPW 
QYVFIHDTLV 
TISAHCNLPL 
RSRVGISSLS 
PDGQNMAEDB 
EVHHFQCPKW 
QLEKENSVDV 
LPDGKIAESL 



41 
I 

GGAGGATTAA 
TGAGAAGCAO 
AGOSTTTCCT 
ACTACAGACA 
ATCAAAAAAA 
ATATTGATGA 
GGGATAAAAC 
ATCTCACTAA 
AGATAACTTT 
AAGGACAAAA 
6TTTTGAGGA 
TTGGGACA6A 
GTTTTGGQAA 
CTGACAAGTA 
ACTGGATTGT 
AAGTTCTTAC 
TTCGAGAGCA 
AGA7TCATGA 
ATACCAGCCT 
AGTTTGCAGT 
CAGATGGCTA 
TTCTTCAQAT 
TTGT06ACAT 
AAGAAATAAT 
CTGGTAGAGA 
CACACTACAA 
GAGGAAGTGA 
AACCAGTCAC 
AACTGCCACC 
TTCTTAGATC 
TAACAGAATA 
ATTCTTCAGG 
AAGGGTATAT 
AATCTGCTAG 
ATCCTTCTAT 
ATGTTGGATC 
AATCTGAGAA 
TTACAGATCT 
CTCATGCT7T 
ACT06CAGAC 
GTGAAGTCTT 
CT6CT6CTTC 



51 
1 

KKYPTCNSPK 
RVSGGVSaw 
KGKGKLRALS 
YNGSLTSPPC 
KFSRQVFSSY 
QQU)6EDQTK 
DNPEU)LPPB 
GTKYNEAKTN 
VEGTSASLND 
PATSAIPFIS 
8VWFPSSTDI 
PBYSTFAYFP 
LESEKKAVIP 
PISDDVGAIP 
KEKNRYIUIV 
FHRMZHEBNV 
TUtNTKIKKG 
VECSAGVGRT 
EAILSKETBV 
PGLTDPPTSA 
GB6TDYZ11AS 
FVYWPNKDEP 
PITPDSPISKT 
YQVARMmiM 
BSLV 



51 
1 

AACAAACAAA 
AGGAGCCGCA 
OGCTTOCATT 
ACASAGAAAA 
TTGGGGAAA6 
AGATCTTACA 
ATCATTGGAA 
TGACTACCGT 
TCACTGGGGA 
ATTTCCACTT 
AGCAGTCAAA 
AGAAAATTTG 
GCAGGCTGCt 
TTACATTTAC 
TTTTAAAGAT 
AAT6CAACAA 
ACAGTACAAG 
AGCAGTTTGT 
TCTTGTTACA 
TTTGTACXaG 
TCAAQACTTG 
AGTAGCCATA 
GOCTACTGAT 
CAAGGAGGAG 
CAGTGCTACA 
T06CATAGGG 
ATTCTCTGGA 
TAAATTAGCC 
TCACACTGTG 
TCCACATATG 
T6AGGAGGAG 
CTCCAGTCOC 
ATTTTCCTOC 
AAATGCTTCC 
GGAGGGAAAT 
AGGCAGAGAG 
GACAACC3UkG 
G6AAATGCCA 
TACCCCATCC 
AACOCAAOOG 
TCCTCTAGTC 
AAGTAGT8AT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800
1860
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
T0GG0CTT6C ATGCTAOGCC TGTATTTCOC AGTGTOGATa T GTC A TTTffll ATCCATOCTQ 2580

TCTTOCTATO A 1T3GT GCA CC TTTQCTTCCA WITCCitTm CTTOCTTCAO TAGTGAATTG 2640 

TTTCGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACCGAGAGT 2700 

QATAAGOTGC CCTTGCATGC TTCTCTGCCA GTGGCTGGGG GTGATTTGCT ATTAGAGCOC 2760 

AGOCTTOCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCrTCftGA GACGCTGGAA 2820 

TTT0GTAGT6 AATCIGGTGT TCTTTATAAA AOGCTTATGT TTTCTCAAGT TGAACOOOC 2880

AGCA6T6ATG CCATGAT6CA TGCAO G TTCT TCAGGGCCIG AA OCTiW A TG C CI'TGTCT 2940 

GATAATGAGG GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000 

GATTCTGTGG GTGTAACTTA TCAGGGTTCC TTATTTAGOG GCCCTAGCCA TATAOCAATA 3060 

OCTAAGTCTT OGTTAATAAC COCAACTGCA TCATTACTGC AGOCTACTCA TGCCCTCTCT 3120 

0GTGATQ0G6 AATGGTCT66 AGCCTCTTCT GATAGTGAAT TTCTTTXACC TGACACAGAT 3180 

OOGCTGACAG CCCTTAACAT TTCTTCACCT GTTTCTOTAO CTGAATTTAC ATATACAACA 3240 

TCTGTGTTTG GTGATGATAA TAAGGOGCTT TCTAAAAGTQ AAATAATATA TGGAAATGAG 3300 

ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACC CTTCTGAAAG CACAGTCATG 3360 

CCCAACAT6T ATGATAATGT AAATAAGTTG AATGCGTCTT TACAAGAAAC CTCTGTTTCC 3420 

ATTTCTAGCA CCAAGGGCAT GnTCCAGGG TOOCTTQCTC ATACCAGCAC TAAOGTTTTT 3480 

6ATCAT6AGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTOTC 3540 

TCTCAAGOVT CTGGTGACAC TTCGCTTAAA CXrTGTGCTTA GTGCAAACTC AGAGCCAGCA 3600 

TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660 

ACCTCAGCTT CTTTTAGTAC TGAAGTATTG CTACAACCT7 CCTTTCAGGC TTCTGATGTT 3720 

GACACCTT6C TTAAAACTOT TCTTCCAGCT GTGGCCAGTG ATOCAATATT G6TTQAAACC 3780 

(XCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCT6CTTCA 3840 

AGTGAAAACA TGCTGCACTC TACATCTGTA CGAGTTTTTG ATGTGTOGCC TACTTCTCAT 3900 

ATGCaCTCTG CTTCACTTCA AGGTTT6ACC ATTTCCTATG CAAGTGAGAA ATATGAACCA 3960 

GTTTTGTTAA AAAGTGAAAG TTCCCACCAA GTGQTACCTT CTTTGTACAG TAATGATGAG 4020 

TTGTTCCAAA 0GGCCAA7TT GGAGA7TAAC CAGGCCCATC CCCCAAAAG8 AAG6CATSTA 4080 

TTTGCTACAC CTGrTTTATC AATTGATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140

CATTCOGATG AAATTTTAAC CTCCACCAAA AGTTCTGTTA CTQGTAAGGT ATTTGCTGGT 4200 

ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260 

GGGCATGTTG CCATTACAGC TGTTTCTCCC CACAGAGATG GTTCTGTAAC CTCAACAAAG 4320 

TTGCTGTTTC CTTCTAAGGC AACTTCTGA6 CTGAGTCATA GTGCCAAATC TGATGCCGGT 4380

TTAGTGGGT6 aTGGTGAAGA TGGTGACACT GATGATGAT6 GTGATGATGA TGATGACA6A 4440 

GATA6TOAT6 GCTTATCCAT TCATAAGTGT ATGTCAT6CT CATCCTATAG AGAATCACAG 4500 

GAAAAGGTAA TGAATGATTC AGACACCCAC GAAAACAGTC TTATGGATCA GAATAATCCA 4560 

ATCTCATACT CACTATCTGA GAATTCTGAA GAAGATAATA GAGTCACAAG TGTATCCTCA 4620 

GACAGTCAAA CTQGTATGGA CAGAAGTOCT GGTAAATCAC CATCAGCAAA TGGGCTATCC 4680 

CAAAAGCACA ATGATGGAAA AGAGGAAAAT GACATTCAGA CTGGTAGTGC TCTGCTTCCT 4740 

CTCA0CCCT6 AATCTAAA6C ATGGGCAGTT CTGACAAGTG ATGAAGAAAG TGGATCAGGG 4800 

CAAOOTACCT CAGATAGCCT TAATGAGAAT GAGACTTCCa CAGATTTCAO TTTTGCAGAC 4860 

ACTAATGAAA AAGATGCTGA TGGGATCXmS QCAGCAGGTO ACTCAGAAAT AACTCCTGGA 4920 

TTCCCACAGT CCCCAACATC ATCTGTTACT A6CGAGAACT CAGAAGTGTT OCAOGTTTCA 4980 

6AGGCAGAGG CCAGTAATAG TA6CCATGAG TCTCGTATTG GTCTAGCTGA GGGGTTGGAA 5040 

TCCGAGAAGA AOQCAGTTAT ACCCCTTGTG ATG6TGTCAG CCCTGACTTT TATCTGTCTA 5100 

O l tSQTT C TTO TGGQTATTCT CATCTACTGO AGGAAATGCT TCCAGACTGC ACACTTTTAC 5160 

TTAGAGGACA GTACATCCCC TAGAGTTATA TCCACACCTC C3iACACCTAT CTTTCCAATT 5220 

rCAGATQATG TOGGAGCAAT TCCAATAAAQ CACTTTCCAA AGCATOTTGC AGATTTACAT 5280 

GCAAGTAGTG GGTTTACTGA AGAATTTGAG ACACTGAAAG AGTTTTACCA 6GAAGTGCAG 5340 

AGCTGTACTO TTOACTTAGO TATTACAGCA GACAGCTOCA AOCAOCCAGA CAACAA6CAC 5400 

AAGAATOQAT ACATAAATAT OG T T GC CTAT GATCATAGCA O GG TT A AGCT AOCACAGCTT 5460 

GCT6AAAAGG ATGGCAAACT GACTGATTAT ATCAATGCC3V ATTATGTTGA TGGCTACAAC 5520 

AGACCAAAAG CTTATATTGC TGCCCAAGGC CCACTGAAAT CCACAGCTGA AGATTTCTGG 5580 

AGAATGATAT GGGAACATAA TGTGGAAGTT A7TGTCATGA TAACAAACCT OGTGGAGAAA 5640 

GGAAGGA6AA AATGTQATCA GTACTGGCCT GCOGATGGQA GTGAGGAGTA OOGGAACTTT 5700 

CTGGTCACTC AGAAGAGTGT 6CAAGT6CTT GGCTATTATA CTGTGAGGAA TTTTACTCTA 5760 

AGAAACACAA AAATAAAAAA GG6CTCCCA6 AAAGGAAGAC CCAGTGGA06 TGTGGTCACA 5820 

CAGTATCACT ACACGCAGTG GCCTGACATQ GGAGTACCAG AGTACTCCCT GCCAGTGCTG 5880 

ACCTTTGTGA GAAAGGCAGC CTATGCCAAG CX3CCATGCA0 TGGGGCCTGT TGTCGTCCAC 5940 

TGCACTGCTO OACTTGGAAG AACAGOCACA TATATTGT O C TAGACA6TAT GTTGCAGCA6 6000 

ATTCAACAOS AAGGAACTGT CAACATATTT GGCTTCTTAA AACACATCC6 TTCACAAA6A 6060 

AATTATTTGG TACAAACTGA GGAGCAATAT GTCTTCATTC ATGATACACT GGTTQAOQCC 6120 

ATACTTAGTA AAGAAACTGA GGTGCTGGAC AGTCATATTC ATOCCTATGT TAATGCACTC 6180 

CTCATTCCTG GACCAGCAOG CAAAACAAAO CTAGAGAAAC AATTCCAGCT CCTGAGCCAG 6240 

TCAAATATAC AGCAGAGTGA CTATTCTGCA 6CCCTAAAGC AATGCAACAG GGAAAAGAAT 6300

CGAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGGTTG GCATTTCATC CCTGAGTGGA 6360 

GAAGGCACAG ACTACATCAA T6CCTCCTAT ATCATGGGCT ATTACCAGAO CAAT6AATTC 6420 

ATCATTACCC AGCACCCTCT CCTTCATACC ATCAAGGATT TCTGGAOQAT GATATGGGAC 6480 

CATAATGOCC AACTG6TGGT TATGATTCCT GATOGCCAAA ACATGGCAGA AGATGAATTT 6540 

GTTTACTGGC CAAATAAAGA TGAGOCTAXA AATTGTOAGA GCTTTAAG6T CACTCTTATG 6600 

GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 6660 

GAAOCTACAC AGGATGATTA TGTACTTGAA GTGAGGCACT TTCAGTGTOC TAAATGOCCA 6720 

AATCCAGATA GCCCCATTAG TAA7VACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 6780 

GCCAATAGGG ATGG6CCTAT GATT6TTCAT GATGAGCATG GAG6AGTSAC QOCA GGAAC T 6840 

TTCTGTGCTC TGACAACCCT TATGCACCAA CXAOAAAAAG AAAATTCGOT GQATGTTTAC 6900 

CAGGTAGCCA AGATGATCAA TCTGATGAGG CCAGOAGTCT TTGCTGACAT T6A6CAGTAT 6960 

CAGTTTCTCT ACAAAGTGAT CCTCAGOCTT GTGAGCACAA GGCAGt»AGA GAATCCATCC 7020 

A C CTCTCTGG ACAGTAATGG TGCAGCATTG CCTGATGGAA ATATAGCTGA GAGCTTAGAG 7080 

TCTTTAGTTT AACACAGAAA GGGGTGGGGG GACTCACATC TOAGCATTGT TTTCCTCTTC 7140 

CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTGTTATCTS TTQATTTOOC ATCAC C TGAC 7200 

AGTAACTTTC ATGACATAGG ATTCTOOOOC CAAATTTATA TCATTAACAA TGTGTGCCTT 7260 

TTTGCAAGAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATTGAAT TTTACAGTAT 7320 

TTCTAAGAAT GGAATTGTQQ TATTTTTTTC TGTATTGATT TTAACAGAAA ATTTCAATTT 7380 

ATAGAGGTTA CGAATTCCAA ACTACAGAAA ATGITTGTTT TTAGTGTCAA ATTTTTAGCT 7440 

GTATTTGTAG CAATTATCAG GTTTGCTAflA AATATAACTT TTAATACAGT AGCCTGTAAA 7500 

TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACT6 CAGTATTCAC CTAAA 6TAGA 7560 

AATAATCTGT TACTTATTGT AAATACTGCC CTAGTGTCTC CATGGACX3VA ATTTA TATTT 7620 

ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAAGTTT TCTAGTTCTG TGTAATTGTT 7680 

TAGTTTAATG ACGTAGTTCA TIA6CTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 7740 
TOTTACCTAA GTCMTAACT TTGTTTCAGC ATGXAATTTT AACTTTTGT G GAAAATAGAA
ATACCTTOIT TTTQAAAGMV GTTTTTATGA GAATAACftOC TTAOOUUICA TTGTTCAAAT 
QOTTTTTATC CAAGGJUVTTG CAAAMTAAA TAXAAATA.TT GCCATTAAAA AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 583 Protein sequence
Protein Accession #: NP_002842.1






7800 
7860
7920 



1 
I 

HRILKRFLAC 
QSPXHIOBDL 

PBASKITFHW 
XLFSVGTBBtl 
TDTVDWIVFK 
TGKEBZBBAV 
HBFLTD6YQD 
LIGTEBZIKB 
RSPTRGSEPS 
GSKTVLRSPH 
ENISQGYIPS 
TAQPDVGSGR 
TEVTPHAPTP 
LNTTPAASSS 
ILPQVTSATE 
KTLMFSQVBP 
SLFSGPSHIP 
PVSVABPTYT 
LNASLQBTSV 
KPVLSAtfSEP 
AVPSDPILVE 
TISYASEKra 
EPUJTLINKL 
PRRDGSVTST 



11 
I 

ZQUiCVCRU) 
TQVMVMIiKKL 
GKOmSSDGS 
LDFKAIIDGV 
DTVSI8BSQL 



USAILNNLLP 



PGKSPSANGL 
MBTSTDFSFA 
ESRIGLABGL 
ISTPPTPIPP 
AOSSNHPDNX 
GPIiKSTAEDF 

LAyyrvRNFT 

KSBAVGPWV 
WFIBDTLVE 
AALKQC33SEK 
TIKDPWRMIW 
EEKLZZQDFZ 
HDUUGGVTAG 
LVSTRQEEKP 



GKGDVPNTSL 
MNLS6TAESL 
SaJPETlTYD 
ESTLQimTE 
SSRQQDLVST 
DSALHATPVF 
SDKVPLHASL 
PSSDAMMHAR 
IPKSSLZTPT 
TSVFGDDNXA 
SISSTKGMFP 
ASSDPA5SEM 
TPKVDKZSST 
FVZiLKSESSH 
ZHSDEZLTST 
KLLFPSKATS 
QSKVMNDSDT 
SQKHNDGKEE 
DTMEKDADGZ 
ESEKKAVIPL 
ZSDDVGAIPZ 
HKHRYINIVA 
WRMIWEHHVE 
LRNTKZKKGS 
HC8AGV6RTG 
AZIiSKETSVXf 
NRTSSZZPVE 
DHNAQLWMZ 
LSATQODWL 
TFCALTTLMH 
ST8LDSN6AA 



21 

KFQGHDKTSL 
EHSIiEGQKFP 
ESVSRFGXQA 
AVFCEVL1T4Q 
DPBiyTSIiLV 
MMSYVLQZVA 
AZVNPGRDSA 
NSTSQPVTKL 
NrvSZTEYBE 
VLZFBSARNA 
ZRVDESEKTT 
VNWYSQTTQ 
PSVDVSFESZ 
FVAGGDLUiE 
SSGPEPSYAL 
ASLLQPTKAL 
LSKSBlIYCai 
GSLAHTTTKV 
LSPSTQLLFY 
MItBLZVSNSA 
OyVPSLYSHD 
KSSVTGKVFA 
ELSHSAKSDA 
HEKSLNDQNH 
NDIQTGSALL 
LAAGDSEZTP 
VZVSALTFZC 
RBFPKaVADL 
YDBSRVKLAQ 
VZVMZTNLVE 
QKGRPSGRW 
TYZVLDSNLQ 
D8HZBAYVHA 
RSRVGZSSLS 
PDGQMMAEDE 
EVHHFQCPKW 
QLEXBTSVDV 
LPOGKZAESL 



31 
I 

KLVEEZGKSY 
EtlTFZENTGIC 
LEMQZYCFDA 
ALDPFZLUJIi 
QS6YVMXM3Y 
TWERPRWYD 
ZCTNGItYGICY 
TNQZRKKBPQ 
ATEKDZSLTS 
BSLLTSFKLD 



41 
I 

TGAIjlQKNWG 
TVBZNLTODY 
DRF8SFEEAV 
LPNSTDKYYZ 
LQNHFREQQY 
TMZBKFAVZtY 
SDQtiZVDMPT 
ISTTTHYNRI 
QTVTBLPPHT 



KSFSAGFVMS 
PVYNGBTPliQ 
LSSYDGAPIiL 
PSLAQYSDVIi 



SGD6EWSGAS 
ETBLQZPSFN 
PDHEZSQVPE 
ETSASPSTEV 
6SENMLHSTS 
BLPQTANLEZ 
GIPTVASDTP 
GLVGGGEDGD 
PJSYSLSmS 
PLSPBSKANA 
GFPQSPTSSV 
LWLVGZLZY 
HASSGFTEEF 
LA£KZ)GKLTD 
KGRRKCDQYW 
rOYHYTQHPD 
QZQHBGTVNZ 
LLZPGPAGKT 
GEGTDYZNAS 
FVYWPNKDEP 
PNPDSPZSKT 
YQVAXHZNZM 
BSIiV 



QGPSVTDZjEM 
PSYSSBVFPL 
PFSSASFSSB 
STTHAASETL 
TVSYSSAZFV 
SDSEFLTiPDT 
EMVYPSESTV 
NNFSVQPTHT 
LLQPSFQASD 
VPVPDVSPTS 
NQAEPPKSRH 
VSTDHSVPZO 
TDDDGDDDDD 
SEmSVTSVS 
VLTSDEESGS 
TSENSEVPHV 
WRKCFQTAHP 
ETLKEFYQEV 
YZKANYVDGY 
PADGSEEYGN 
MGVPEYSIiPV 
FGPUCHZRSQ 
KLBKQFQLLS 
YZMGYYQSNB 
INCESPKVTL 
FELZSVZXES 
RP6VFADZEQ 



51 
I 

KKYPTCNSPK 
RVSGGVSEMV 
KGKEKLRALS 
YNGSLTSPPC 
KFSRQVFSSY 
QQUX3BDQTX 
DNPBLDLPPB 
GTRYNEAKTN 
VEGTSASUID 
PATSAZPFIS 
NVWFPSSIDZ 
PHYSTPAYFP 
VTPLLLDNQZ 
LFRHLHTVSQ 
EFGSESGVLY 
HDSV6VTYQQ 
DGLTAU7ZSS 
MPNMYDNVNK 
VSQASGDT6L 
VDTLLKTVLP 
HMHSASLC2GIi 
VFATPVLSZD 
KGHVAZTAV8 
SDSDGLSIHK 
SDSOTGMDRS 
GQGTSDSUra 
SEAEASKSSH 
YLEDSTSPRV 
QSCTVDLGZT 
NRPKAYZAAQ 
FLVTQKSVQV 
L7FVRKAAYA 
RMYLVQTEBQ 
QSNZQQ8DYS 
FZZTQKPLLH 
MAEEHKCLSN 
AANHDGPHZV 
YQFLyKVZLS 



Seq ID NO: 584 DNA sequence

Nucleic Acid Accession #: NM_005688.1

Coding sequence: 126..4439



1 

I 

CXX»36CAGGT 
A66GOO0CM3 
AGAAGAT6AA 
GTGTGAGGGA 
GGAGAACTC6 
TCTCTCTTGA 
GAAAGTACCA 
ACGOUn'GGA 
CC C GTGTGGC 
AOSAGTCTTC 
AAGTT6GGCC 
TCATCCZCTC 
TCATGGT6AA 
TGTTGTTAGT 
CTTGGGCATT 
TTAAGAAGAT 
TTTGCTCCAA 
QAGGACCOGT 
GCTTCCTGGG 
TCACAGCATA 
A7GAA6TTCT 
AGAGTGTTCA 
AGG6TATCAC 
CTGTTCATAT 
TCTTCAATTC 
AA6CCTCAGT 
TAAA6AACAA 
G6GACTCCTC 
ACAAGAGGGC 
A6G0G6TGCT 
C0SAA6AGGA 



11 

I 

QGCTCATGCT 
GAATTCTGAT 
GGATAT06AC 
GAGAACCAGC 
ACOGTTGGAA 
TGCCTCCATG 
TCATG6CTT8 
CAAT6CT6GG 
CCACAAGAAG 
T6ACGTGAAC 
AGA06CTGCT 
GATOGTGTGC 
ACACCTCTT6 
GCTGGGCCTC 
GAATTACCGA 
CCTTAAGTTA 
0GATGGQCA6 
TGTTGCCATC 
ATCAGCTGTT 
TTTCAGGAGA 
TACTTACATT 
AAAAATGOQC 
TGTG6GTGTG 
GACCCTGGGC 
CATGACTTTT 
GGCTGTTGAC 
ACCABCCAGT 
CCACTCCA6T 
TTCCAGGGGC 
GGCA6AGCAS 
AGAAGGCAAG 



21 

I 

CGGGAGOGTG 
GTGAAACTAA 
ATAG6AAAA6 
ACTTCT GG GA 
TGCCAAGATG 
CATTCTCAGC 
AGTQCTCTGA 
CTTTTTTCCT 
GGGGAGCTCT 
TGCAGAAGAC 
TCCCTGOGAA 
CTGAT6ATCA 
GA6TATACCC 
CTCCTGACGG 
ACCCS3TGTCC 
AA6AACATTA 
A6AATGTTT6 
TTA06CATGA 
TTTATCCrCT 
AAATGOGTGG 
AAATTTATCA 
GAG8A86AGC 
GCTCCCATT6 
TTOGATCTGA 
GCTTTGAAAG 
AGATTTAAGA 
OCTGACATCA 
ATCCAGAACT 
AAGAAAGAGA 
AAAGGCCACC 
CACATCCACC 



31 

I 

GTTGAGOGGC 
CAGTCTGTGA 
AGTATATCAT 
OGCACAGAGA 
CCTTGQAAAC 
TCAGAATCCT 
AGCCCATCG6 
GTATGACTTT 
CAATGGAAGA 
TAGAGAGACT 
GG G TTGTGTG 
C6CAGCT6GC 
A6GCAACAGA 
AAATCGTGOG 
GCTTGCGGGG 
AAGAGAAATC 
AG6CM3CASC 
TTTATAAT6T 
TTTACCCAGC 
CX33CCACGGA 
AAATGTATGC 
GTOGGATATT 
TGGTGGTGAT 
CAGCAGCACA 
TAACACC3GTT 
GTTTGTTTCT 
AGATA6AGAT 
06CCCM6CT 
AGGTGAGGCA 
TCCTCCTGGA 
TGGGCX3VCXT 



41 

1 

TGGCG0G6TT 
GCOCTGGAAC 
CCCCAGTCCT 
C0GT6AAGAT 
AGCAGCCOGA 
GGATGAG6A6 
GACTACTTCC 
T i XXfmGC T i ' 
CGTOTQGTCT 
GTGGCAAGAA 
GATCTTCTGC 
TGGCTTCAGT 
GTCTAACCT6 
GTCTTGGTOG 
GGCCATCCTA 
CCTGGGTGA6 
06TTGGCAGC 
AATTATTCIG 
AATGATGTTT 
TQAAOGTGTC 
CTGGGTCAAA 
GGAAAAA60C 
TGCCAG06T6 
ggctttcaca 
TTCAGTAAAG 
AATGGAA6A8 
GAAAAATGCC 
6ACCCCCAAA 
GCTGCAGCXSC 
CAGTGACCSVG 
G0GCTTACA6 



51 
I 

GTCCTGGAGC 
CTC06CTCA0 
Q6GTATAGAA 
TCCAAGTTCA 
GCCX^AGGGCC 
CATCCCAAQG 
AAACACCASC 
TCTTCTCTGG 
CTGTCCAAGC 
GAGCTGAATG 
CGCACCAGGC 
OGACCAGCCT 
CAGTACAGCT 
CTTGCACTGA 
ACCATGGCAT 
CTCATCAACA 
CTGCTGGCTO 
QGAOCAACAQ 
GCATCA08GC 
CAGAA6ATGA 
GCATTTTCTC 
GGOTACTTGC 
GTGAOCTTCT 
GTGGTX3ACAG 
TCCCTCTCAG 
GTTGACATGA 
ACCTTGQCAT 
ATGAAAAAAB 
ACTGAGCATC 
0G6CCCAGTC 
AGGACACT6C 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860
1920 
1980 
2040 
2100 
2160 
2220 
2280 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 



ACACCATCGH TCTGGAGATC CAAGA6GGTA AACTGGTTGG AATCT G OBGC AQTGTGGGAA 1920

GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGGCAGAT GAOGCTTCTA GAGGGCA6CA 1980 

TTGCAATCAG TGGAACCTTC GCTTATGTGG CXCAGCAGGC CTGGATCCTC AATGCTACTC 2040 

TGAGAGACAA CATCCTGTTT GGGAA6GAAT ATGATGAAGA AAGATACAAC TCTGT6CTGA 2100 

ACA6CT6CTG CCTGAGGCCT GACCTGGCCA TTCT7CCCAG CA606ACCTG AOGGAGATTG 2160 

GA6AG06AGG AGOCAAOCTG AGOaGTGGGC AG06GCAGA6 GATCAGCCTT GCCOGOGCCT 2220 

TGTATAGTGA CAGGftGCATC TACATCCTGG AOGACCCCCT CAGTGCCTTA GATGCCCATG 2280

TGGGCAACCA CATCTTCAAT AOTGCTATCC GGAAACATCT CAAGTCCAAQ ACAGrrCTGT 2340 

TTGTTACCCA CCAGTTACAO TACCTGGTTG ACTGTGATGA AC?rGATCTTC ATGAAAGAGG 2400 

6CTGTATTAC GGAAAGAG6C ACCCATGAGG AACTGATGAA TTTAAATG6T GACTAT6CTA 2460 

OCMTTTTAA TAACCTGTTO CI06GAGAGA CAOOGCCAQT T6A6ATCAAT TCAAAAAAOG 2520 

AAACCAGTGG TTCACAGAAG AAGTCACAAG ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 2580

AGGAAAAAGC AGTAAAGCCA GAGGAAGGGC AGCTTGTGCA GCTGGAAGAO AAAGGGCAGG 2640 

GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACATCCAGGC TGCTGGGGGC CCCTTGGCAT 2700 

TCCTGGTTAT TATCGCCCTT TTCATGCTGA ATGTAGGCAG CACCGCCTTC AGC31CCTGGT 2760 

GGTTGAGTTA CTGGATCAAG CAAG6AAG0Q G6AACACCAC TGTGACTOGA 6GGAAC3QAGA 2820 

CCTGGGTGAG TGACAGCATG AAGGACAATC CTCATATGCA GTACTATGOC A6CATCTA0G 2880

COCTCTCCAT GGCAGTC3^TG CTGATCCTGA AAGCCATTOO AGGAGTTGTC TTTGTCAAGG 2940 

GCACGCTGCG AGCTTCCTCC CGGCT6CATG AOGAGCTTTT COGAACGATC CTTC3GAAGCC 3000 

CTATGAAGTT TTTTGACACG ACCCCCACAG QGAGGATTCT CAACAGGTTT TCCAAAGACA 3060 

TQGATGAAGT TGACGTGOSG CT0CCX3TTCC AGGCCGAGAT GTTCATCCAC AACGTTATCC 3120 

TX3GTGTTCTT CTGTGTGGGA ATGATOGCAG GAGTCTTCCC GTGGTTCCTT GTGGCAGTGG 3180 

6GCCCCTTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTOQGGAGC 3240 

TGAAGCGTCT GGACAATATC ACGCAGTCAC CTrrCCTCTC CCACATCACG TCCAGCATAC 3300 

AGGGCCTTGC CACCATCCAC GCCTACAATA AAGGGCAGGA OTTTCTGavC AGATACCAGG 3360 

AGCTGCTX3GA TCACAACCAA GCTCCTTTTT TTTTGTTTAC GTGTGOGATG CGGTGGCTGG 3420 

CTGTGOSGCP GGACCTCATC AGCATCGCCC TCATCACCAC CACGGGGCTG ATGATOGTTC 3480 

TTATGCACGG GCAGATTCCC CCAGCCTATG 06GGTCT0GC CATCTCTTAT GCTGTCCAGT 3540 

TAAOGGGGCT 6TTCCAGTTT AGGGTCAGAC TGGCATCTGA GACAGAAGCT C6ATTCACCT 3600 

OGGTGGAGAG GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGCACCT GCCAGAATTA 3660 

AGAACAAGGC TCCCTCCCXTT GACTG6CCCC AGGAGGGACS^ GGTGACCTTT GAGAAOGCAG 3720 

AGATGAGGTA CXXMOAAAAC CTCCCTCTTG TCCTAAAGAA AGTATCCTTC ACGATCAAAC 3780 

CTAAAGAGAA QATT66CATT GTGGGG0G6A CA6GATCA00 GAA6TCCTC0 CTGQGGATGG 3840 

OOCTCITCOO TCTGGTGGAG TTATCTGGAG GCT6CATCAA GATTGATGGA 6TGAGAATCA 3900 

GTGATATTGG CCTTGCCGAC CTCOSAAGCA AACTCTCTAT CATTCCTCAA GAGCOOGTGC 3960 

TGTTCAGTGG CACTGTCAGA TCAAATTTGG ACCCCTTCAA CCAGTACACT GAAGACX»GA 4020 

TTTGGGATGC CCTGGAGAGG ACACACATGA AAGAATGTAT TGCTCAGCTA CCTCTGAAAC 4080 

TTGAATCIGA ASTGATQGAa AATOGGOATA ACTTCTCAGT OOGGGAAOSS CAGCTCTTGT 4140 

GCATAGCTAG AOOCCTGCTC OSCCACTGTA AGATTCTGAT TTTAOATSAA G0CACA6CTG 4200 

CCATGGACAC A6AGACAGAC TTATTGATTC AAGAGAOCAT CC6AGAAGCA TTTGCAGACT 4260 

GTACCATGCT GACCATTGOC CATOGCCTGC ACAOGGTTCT AGGCTCCGAT AGGATTATGG 4320 

TGCTGGCCCA GGGACAGGTG GTGGAGTTTG ACACCCCATC OGTCCTTCTG TCCAACGACA 4380 

GTTCC06ATT CTAT6CCATG TTTGCTGCTQ CAGAGAACAA GGTGGCTGTC AA GGG CT G AC 4440 

T CCTCCCTGT TOIOGAAGTC TCTTTTCTTT AGAGCATTGC CATTC0CT6C CTG G GGOGGG 4500 

COCCTCATCG CGTCCTCCTA CCGAAACXTTT GCCTTTCTOS ATTTTATCTT TCGCACAGCA 4560 

GTTCCGGATT GGCTTGTGTG TTTCACTTTT AGGGAGAGTC ATATTTTGAT TATTGTATTT 4620 

ATTCCATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

GGGAAGCQTT AITATAATTO TATCAGAGGC CTATAATGAA OCTTTATAOS TGTA GCTATA 4740 

TCTATATATA ATTCTGTACA TAGCCTATAT TTACAGTGAA AATGTAA6CT GTTTATTTTA 4800

TATTAAAATA AGCACTGTGC TAATAACAGT GCATATTCCT TTCTATCATT TTTGTACAGT 4860 

TTGCTGTACT AGAGATCTGG TTTTGCTATT AGACTGTAGG AAGAGTAGCA TTTGArrCTT 4920 

CTCTAGCTGG TGGTTTCAC6 GTGCCAGGTT TTCT6GGTGT CCAAAGGAAO ACGTGTGGCA 4980 

ATAGTGGGCC CTCOQACAOC OCCCTCTGGC QCCTCOCCAC AGCOGCTCCA GGGGTGGCTO 5040 

GAGAOGGGTQ GG03GCTGQA GACCAT6CAG AG0GC0GT6A GTTCTCAGG6 CTCXrTGOCTT 5100 

CTGTCCTGGT OTCACTTACT GTTTCTGTCA GGAGAGCAGC GGGGOGAAGC CCAGGCCCXTT 5160 

TTTCACTCCC TCCATCAAGA ATGGGGATCA CAGAGACATT CCTCCGAGCC GGGGAGTTTC 5220 

TTTCCTGCCT TCTTCTTTTT GCTGTTGTTT CTAAACAAGA ATCAGTCTAT CCACAGAGA6 5280 

T0CCACT6CC TCAGGTTOCT ATOGCTGGCC ACTGCACAGA GCTCTCCAGC TCCAAGACCT 5340 

GT l X ayr rOC A AGCCCTGQAO CCAACTOCTG CrrPTTGAOG TGGCACTTTT TCATTTGCCT 5400

ATTCCCACAC CTCCACAGTT CAGTGGCAGG GCTCAGGATT TCGTGGGTCT i^VrnVClTt 5460 

CTCACCGCAG TCGTCGCACA GTCTCTCTCT CTCTCTCCCC TCAAAGTCTG CAACTTTAAO 5520 

CAGCTCTTGC TAATCAGTGT CTCACACTGG CGTAGAAGTT TTTGTACTGT AAAGAGACCT 5580 

AOCrCAOGTT GCX08T1GCT tfit nW l'rXXi GTOTSTTCOC Q CAAA OOCOC TTTGTOCIGT 5640 

G60GCT66TA GCTCAGGTGG GOGTGaTCAC T6CTQTCATC AGTTGAATGG TCAOOGTTGC 5700 

ATGT06TGAC CAACTAQACA TTCTGTOGCC TTAGCATGTT TGCTGAACAC CTT8TG6AAG 5760 

CAAAAATCTG AAAATGTGAA TAAAATTATT TT6GATTTT6 TAAAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 

Seq ID NO: 585 Protein sequence
Protein Accession #: NP_005679.1

1 11 21 31 41 51

| | | | | |

MKDIDIGKEV ZZPSPGVRSV RERTST56TO RDREDSKFRR TRPbBOQDAL BTAARABSLS 60 

IiDASMHSQLR ILDEEHPKGK YHHGLSAUCP IRTTSKEQHP VDNAGLFSCM TPSWLSSLAR 120 

VAHKKGELSM EDVHSLSKHE SSDVNCRRI^ RLHQEEU7EV GFZ3AA5LRRV VWIPCRTRLI 180 

LSZVdiMITQ LAGFSGPAFM VKHLLEYTQA TEStTLQYSLL LVLGLLLTBI VRSMSLALTW 240 

AWYHT6VRL RGAILTNAFR KILKLKHIKE KSLGELZNZC SNZX3QRMFBA AAVOSLLAGO 300 

PWAILGHZY NVIILGPTGF I/3SAVFIIiFY PAMMEASRLT AYFRRKCVAA TDERVQKMNB 360 

VLTYIKPIKM YAMVKAPSQS VQKIREBERR ILEKAGYPOG ITVGVAPIW VIASWTPSV 420 

HMTLGFDLTA AQAPTWTVF NSHTFALKVT PPSVKSLSEA SVAVDRPKSL FU4EEVHMIK 480 

HKPASPHIKI EKKKATLAWD SSBSSIQNSP KLTPXMKKDK RA5RQKKBRV RQLQRTEHQA 540 

VIAEQKiSLL LDSDERPSFB EEEX5KHIHLG HLIUUQRTIAS ZDLBIQBGKL VQZ08SVGSQ 600 

KTSLISAILG QMTLIiEGSIA ISGTFAYVAQ QAWZLSATLR DHZLFGKSyD BBRYNSVLH9 660 

CCLRPDLAZL PSSDLTEIGB RGANLSGGQR QRISLARALY SDRSIYIU)D PLSAIJ)AHV6 720 

NHZFNSAIRK HLKSKTVLPV THQLQYLVDC DEVIFKKBGC ZTERGTHEBL MNLNGDYATZ 780 

PMNI.LLGBTP PVEZNSKRBT SGSQKKSQDK GPKT65VKKB KAVKPSSGQL VQLEBKGQGS 840 
VPWSVyGVYX QAAGCSPLAFL VIMALFKUIV GSTAFSTHWL
VSDSMKDHFH KQYYASIYAL SMAVKLIUA ZR6VVFVK6T 
KPPDTTPTGH ILNRPSKDMD EVOVRLPFQA EKFIQNVILV 
LVILFSVLHI VSRVLIRELK RLONITQSPP LSHITSSIQG 
LDDKQAPFFL FTCAMRWLAV RLDX.ISIALI TTTGLMZVLM 
GLFQFTVRIA SETBARFTSV ERmHYIKTL SLEAPARXKS 
RYRaniPZiVL RRVSFTIKPR EKIGZV6RT0 SGKSSIiratAL 
IGLADLRSKL SIIPQEFVLP SGTVHSmJ^P FNQYTEDQIW 
SBVMEKGDNF SVGERQLZ^CI ARALLRECKI XiXU3EATAAM 
NLTIABRLHT VLGSDRIMVL AQGQWEFDT PSVhLSSDSS 

Seq ID NO: 586 DNA sequence

Nucleic Acid Accession #: NM_001327.1

Coding sequence: 89..631






SYWIlCQGSai 



PPCVOCIAGV 
LATIHAYKKG 
HGQZPPAYAQ 
KAPSEDHPQB 
PRLVEXtSGGC 
DALERTHMKE 
DTBTDIiLXQB 
RPYAMPAAAE 



1 
I 

AGCAGGGGGC 
CTGAGAGCCG 
GAC3GQGCGAT 
TGG06GCCCA 
AAGG6CCTC0 

gctgaatgga 
cx:to6Ccatg 

GGATGCCCCA 
CATA CTGACT 
CTGTCTCCAG 
6GCTCAGCCT 
GCCPCCrCCC 
GTTTGTCGCT 



11 

I 

GCTGTGTGTA 
GGCAGAGGCT 
GCTGATGGCC 
6GA6AGGCG6 
G6GC06GGAG 
TGCTGCAGAT 
CCTTTOGOGA 
CCGCTTCCCG 
ATCC3GACTGA 
CAGCTTTCCC 
OCCTCAOOGC 
CTAGGGAATG 
GGAGGAGGAC 



21 
I 

COQAQAATAC 
CX3GGAGCCAT 
CAGGAGGCCC 
GTGCCA0GG6 
GAGGCGCCCC 
GCSGGGGCCAG 
CACCCATGGA 
TGCCAGGGGT 
CTGCTGCAGA 
TCjTTGATGTG 
AGA06C6CTA 
GTCCCAGCAC 
GGCTTACATO 



31 
I 

GAGAATACXrr 
GCAGGCOGAA 
TGGCATTCCT 
OGGCAGAGGT 
G0C6GGTC06 
G6GGCCGGAG 
AGCAQAOCTG 
GCTTCTQAAG 
CCACCGCCAA 
GATCAOGCAG 
AGCCCAGCCT 
GAGTGGCCA6 
TTTGTTTCTG 




CATGG06G0Q 
AGCC6CCTGC 
GCCC6CAGGA 
GAGTTCACTG 
CTGCAGCTCT 
TGCTTTCTOC 
GGC6CXXXTT 
TTCATTGTGQ 
TAGAAAATAA 



TTVTRGHETS 
LFRRILRSPM 
PPWFLVAVGP 
QKPliHRYQEL 
LAISYAVQLT 
(ZBVTFENABM 
XKZDGVRXSD 
CXAQLPLXLB 
TXREAFADCT 
NKVAVKG 



51 
I 

GACCTTCTCT 
CAGGGGGTTC 
GGGGCAATGC 
CAGGGGCAGC 
CGGCTTCAGG 
TTGAGTTCTA 
GCCTGGCCCA 
TGTCOGGCAA 
CCATC3VGCTC 
CCGTGTTTTT 
CCTAOGTCAT 
GGGCCTGATT 
AACTGA6CTA 



900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 



Seq ID NO: 587 Protein sequence
Protein Accession #: NP_001318.1

11 



51 



1 11 21 31 41 

| | | | | |

MQAE6RGT60 STGDAD6PGG PGXPDGPG6N AGGPGBAGAT GGR6PRGAQA ARASGPGGGA 
PRGFKG6AAS 8UIGCCR06A R6PESRLLEF YLAMPFATPM EAEIARRSIA QDAPPLFVPG 
VLLKEFTVSG KXLTXRLTAA DHRQLQLSXS SCLQQLSLLM HZTQCFLPVP LAQPPSGQRR 

Seq ID NO: 588 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 52..459



1 

i 

CCTOGTGGGC 
GAAGGCCAGO 
CCTGATGGOC 
0GTCCCCG6G 
CCGCATGGCG 
GACAGCOGCC 
ATCAOCTCCT 
GTGTTTTTGO 
TAGGTCATGC 
GCCTGATTGT 
CT6AGCTA 



11 
I 

CCTGACCTTC 
GCACM3GGGG 
CAGGGG6CAA 
G0GCAGGG6C 
GTGCOGCTTC 
T6CTTCAGTT 
GTCTCCAGCA 
CTCAOGCTCC 
CTOCTCCCCT 
TTGT0QCT60 



21 
I 

TCTCTGA6A0 
TTCOAOGGGC 
TGCTGGC66C 
AGCAAGG6CC 
TGCGCAGOAT 
C0QACT6ACT 
GCTTTCCCTG 
CTCA6QGCAG 
AGG6AAT6GT 
A0GA66AGGG 



31 
I 

CCGGOCAGAG 
OATGCTGATG 
CCAGGAGAG6 
T0GGGGCC6A 
GGAA6GTGCC 
GCIGCAGACC 
TTGATGT6GA 
AiQGGGCTAAO 
CCCAOCACGA 
CTTACATOTT 



41 

I 

6CTC0GGAGC 
GCCGAGGAGG 
C6GGTGCCAC 
GAGGA60C6C 
CCT60GGGGC 
ACOSCCAACT 
TCACGCAGTG 
CCCAGCCTOQ 
GT600CAGTT 
TGTTTCTGTA 



51 
1 

CA7GCAGGCC 
CCCTGQCATT 
GGGCOGGAOA 
CC0G06GGGT 
CAGGA6GCCG 
GCAGCTCTCC 
CTTTCTGCOC 
CGGCCCTTOC 
CATT6T6G6G 
GAAAATAAAG 



Seq ID NO: 589 Protein sequence
Protein Accession #: Eos sequence



41 



51 



I 11 21 31 

| | | | | |

HQABGQGTGO STQDAXX5F6G PGXPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASQPRGGA 
PRGPHGGAAS AQSGRCPCGA RRPDSRLLQF RLTAADBRQL QLSISSCLQQ LSIiLMHXTQC 
FXiPVFLAQAP SGQRR 

Seq ID NO: 590 DNA sequence
Nucleic Acid Accession #: NM_005562.1
Coding sequence: 90..3671



1 
I 

ACAGCG6AGC 
AOACAQAGAC 
GCTTCT03CT 
ATGGGAAGTC 
TCOGCTGOCT 
GCTTTTACOG 
CTCTTAGTGC 
CCAGATGOGA 
ACCAGAGACT 
A060GGGC0G 
CAGGTTACTA 
6GCATTCAGC 
rrCATCAAGA 
AAT6GTCACA 



11 
I 

GCA6AGTQAO 
TGA6CGGCCC 
CCTCCPSCCC 
CAG6CAGT6T 
CAACT6CAAT 
GCACAQAGAA 
TOGATGTGAC 
C0GATGTCT6 
GCTA6ACTCC 
CTGT6TCTGC 
TAATCTGGAT 
CA6CTGC0GC 
TGTTGATGGC 
G06CCATCAA 



21 
I 

AACCACCAAC 
GGCACOGCCA 
6CAGCXXX3GG 
ATCTTTGATC 
GACAACACTO 
AGOGACCGCT 
AACTCTGGAC 
CCAGGCTTCC 
AAGTGTGACT 
AAGCCAQCTG 
GGQGGGAAOC 
AOCTCTGCAG 
TGGAAGGCTG 
GAT6TGTTTA 



31 
I 

CGA6606003 
TGCCTGCGCr 
OCAOCTCCAG 
GGGAACTTCA 
ATGGCATTCA 
GTTTGCCCTG 
GGTGCAGCTG 
ACATGCTCAC 
GTQACOCAGC 
TTACTGGAGA 
CTGAGG6CTG 
AATACAGTCT 
TCCAAOGAAA 
6CTCAGC0CA 



41 

I 

06CAGC6ACC 
CTGGCTGGGC 
GAGGGAAGTC 
CAGACAAACr 
CTQCSAfiAAG 
CAATTGTAAC 
TAAACCAGGT 
GGATGOC3G6G 
TSGCATCGCA 
AOGCtGTGAT 
TACCO^GTGT 
CCATAAGATC 
TGGGTCTCCT 
AOGACTAGAC 



51 
I 

CCTGCAGOGG 
TGCTGCCTCT 
TGTGATTGCA 
G6TAATGGAT 
TGCAAGAATG 
TCCAAAC3GTT 
GTGACAGGAG 
TGCACCCAAO 
GGGCCCTGTG 
AGGTGT06AT 
TTCTGCTATQ 
ACCTCTACCT 
GCAAAGCTCC 
CCTGTCTATT 



60 
120 
180 
240 
300 
360 
420 
480
540 
600 
660 
720 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480
540
600
660
720
780
840 
TTOTOGCTCC TGCCAAATTT CTTOGGMVTC AACAGGT6AG CTATOGGCftA ASCCTGTCCT 900

TTGJVCTACOO TGTGGACAGA G6AGGCA6AC ACCATCTGC COVTGATGTO ATTCTGGAAG 960 

GTGCTGGTCT ACX3GATCACA GCTCCCTTQA TGCCACTTGG CAAGACACTG CCTTGTGGGC 1020 

TCACCAAGAC TTACACATTC AGGTTAAATG AGCATCCAAG CAATAATTGG AGCCCCCAGC 1080 

TGAGTTACTT TGAGTATOGA AGGTTACTGC GGAATCTCAC ACCCCTCOaC ATCCGAGCTA 1140 

CATA1GGAGA ATACAG7ACT GGGTACATTO ACAATGTGAC C3CTGATTTGA GCCOGCCCTG 1200 

TCTCTG6AGC CCCAGCACCC TGG6TTGAAC AGTGTATATO TCCraTTGGO TACA AGGGG C 1260 

AATTCTGCCA GGATTGTGCT TCTGGCTACA AGAGAGATTC AGOGAOACTG GGGCCTTTTG 1320

GCACCTGTAT TCCTTGTAAC TGTCAAGGGG GAGGGGCCTG TGATCCAGAC ACAGGAGATT 1380 

GTTATTCAGG GGATGAGAAT CCTGACATTG AGTGTGCTGA CTGCCCAATT GGTTTCTACA 1440 

A0QATC06CA CGACCOC06C AGCTGCAAGC CATGTOOCTG TCATAAOBGO TTCAGCT6CT 1500 

CAGT6ATGCC 06A6ACOGAG GA CG TGGTGT GCAATAACTG CCCTCCCGGG GTCACX^GGTG 1560 

CCCGCTGTGA GCTCTGTGCT GATGGCTACT TTGGGGACCC CTTTGGTGAA CATGGCCCAG 1620 

TGAGGCCTTG TCAGCCCTGT CAATQCAACA ACAATGTGGA CCCCAGTGCC TCTGGQAATT 1680 

GTGACOGGCT GACAG6CAGQ TGTTT6AAGT GTATCCACAA CACAGCC66C ATCTACTG06 1740 

ACC3tGTGC3Ul AGCM3QCTAC TTOSGGGACC CATTOGCTGC CAACCCAGCA GACAAGTGTC 1800 

GAGCTT6CAA CTGTAACCCC ATGGGCTCAG AG0CTGTAG6 ATGTOGAAGT GATGGQlOCT 1860

GTGTTTGCAA GCCRGGATTT GGTGGCCCCA ACTGTGAGCA TOGAGCATTC AGCTGTCCAG 1920 

CTTGCTATAA TCAAGTGAAG ATTCAGATGG ATCAQTTTAT GCAOGAGCTT CAGAGAATGG 1980 

AGGCCCTGAT TTCAAAGGCT CAGGGTG6TG ATGGAGTAGT ACCTGATACA GAGCTtSGAAG 2040 

GCAGGATGCA GCAGGCTGAG CAGGCCCTTC A3GACATTCT GAGA0AT60C CAGATTTCAS 2100 

AAGGTGCTAG CAGATCCCTT GGTCTCCAGT TGGCCAAGGT GAG6AG0CAA GAGAACAGCT 2160 

ACCAGAGCOG CCTGGATGAC CTCAAGATGA CTGTGGAAAG AGTTCGOGCT CTGGGAAGTC 2220 

AGTACCAGAA CCXSAGTTOGG GATACTCACA GGCTCATCAC TCAGATOCAG CTGAGCCTGG 2280 

CAGAAAGTGA AGCTTCCTTG GGAAACACTA ACATTCCPGC CTCAGACCAC TAC3GTGGGGC 2340 

CAAATGGCTT TAAAAGTCTG GCTGAGGAGG CCACAAGATT AGCA6AAAGC CAOGTTQAGT 2400 

CAGCCAGTAA CATGGAGCAA CTGACAAGGG AAACTGAGGA CTATTGCAAA CAAOCCCTCT 2460 

CACTGGTGaS CAA6GCCCT6 CATGAAGGAG TOGGAAGOGG AAGOGGTAGC CCGGAOGGTG 2520 

CTGTGGTGCA AGGGCTTGTG GAAAAATTGG AGAAAACX3UI GTCCCTOGCC CAGCAGTTGA 2580 

CAAGGGAGGC CACTCAAGCG GAAATTGAAG CAGATAGGTC TTATCAGCAC AGTCTCOGCC 2640 

TCCTGGATTC AGTGTCTCGG CTTCAQGGAG TCAGTGATCA GTCCTTTCAG GTG6AAGAAG 2700 

CAAAGAG6AT CAAACAAAAA GOGGATTCAC TCTCAACGCT GGTAACCAGG CATATGGATG 2760 

AGTTCAAGGa TACACAAAAS AATCTG6GAA ACTGGAAAGA AGAAGCACAG CAGCTCTTAC 2820 

AGAATG6AAA AAGTGGGAGA GAGAAATCAG ATCAGCTGCT TTCC06TGCC AA7CTTGCTA 2880 

AAAGCAGAGC ACAAGAAGCA CTGAGTATGG GCAATGCCAC TTTTTATGAA GTTGAGAGCA 2940 

TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAGAAAAGCA GAAGCTGAAG 3000 

AAGCCAT6AA GAGACTCTCC TACATCAGCC AGAAGGTTTC AGATGCCAGT GACAAGACCC 3060

AGCAAGCAGA AAGAGCCCT6 GOGAGGGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGG 3120 

OCGGGGAGGC CCTGGAAATC TCCAGTGAGA TTGAACAGGA GATTGGGAGT CTGAACTTGG 3180 

AA6CCAATGT GACAGCAGAT GGAGCCTTGG CCATGGAAAA GQGACTOQCC TCTCTGAAGA 3240 

GTGAGATGAG GGAACTGGAA GGAGAGCTGG AAAGGAAGGA GCTGGAGTTT GACACX5AATA 3300 

TGGATGCAGT ACAGATGGTG ATTACAGAAG CCCAGAAGGT TGATACCAGA GCCAAGAACG 3360 

CTGQG6TTAC AATGCAAQAC ACACTCAACA CATTAGACGG OCIGCTGCAT CTGATGGACC 3420 

AGCCTCrrCAG TGTAGATGAA GAGGbOCTGG TCTTACTG6A 0CA6AA6CTT TOXGAGCCA 3480 

AGACXX3«SAT CAACAGCCAA CTGCG6CCCA TGATGTOVGA OCTGGAAGAG AGGGCACGTC 3540 

AGCAGAGGGG OCACCTCCAT TTGCTGGAGA CAAGCATAGA TGOGATTCTG GCTGATGTGA 3600 

AGAACTTGGA GAACATTAGG GACAACCTGC CCCCAGGCTG CTACAATACC CAGGCTCTTG 3660 

AOCAACAGTG AAGCTQOCAT AAATATTTCT CAACTGAOGT TCTTGGGATA CAGATCTCAG 3720 

GGCTCGGGAG CCATGTCATG TGAGT GGG T G G6 A TG GG GAC ATTTGAACAT 6TTTAATGGG 3780 

TATOCTCAGQ TCAACTGACC TGA0CXX3VTT CCTGATCCCA TGGCCAGGTQ GTTGTCTTAT 3840 

TGCACCATAC TCCTTGCTTC CTGATGCTGG GCAATGA6GC AGATAGCACT GGGTGTGAGA 3900 

ATGATCAAGG ATCTGGACXX: CAAAGAATAG ACTGGATGGA AAGACAAACT GCACAGGCAG 3960 

ATGTTTGCCT CATAATAGTC GTAAGTGGAG TCCTGQAATT TG6ACAAGTQ CIGTTGGQAT 4020 

ATAGTCAACT TATTCTTTGA GTAATGTGAC TAAAGGAAAA AACTTT6ACT TTOCCCWQOC 4080 

ATQAAATTCT TCCTAATOTC AGAACAGAGT GCAACCCAGT CACACTGTGG CCAGTAAAAT 4140 

ACTATTGCCT CATATTGTCC TCTGC3UVGCT TCTTGCTGAT GAGAGTTCCT CCTACTTACA 4200 

ACCCAGG6TG TGAACATGTT CTCCATTTTC AAGCTGQAAG AA6TGAGCAG TGTTGGAGTG 4260 

AGGACCIGTA AG6CA06CGC ATTCAGAGCT ATGGTGCTTG CTGGTGOCTG CCACCTTCAA 4320 

GTTCT66ACC TGGGCATGAC ATCCTTTCTT TTAATGATGC CATGOCAACT TA GASAT TGC 4380 

ATTTTTATTA AAOCATTTCC TACCAGCAAA GCAAAT6TTG GGAAAGTATT TACTTTTTCO 4440 

GTTTCAAAOT GATAGAAAAC TGTGGCTTGG GCATTGAAAG AGGTAAAATT CTCTAGATTT 4500 

ATTAGTCCTA ATTCAATCCT A C TTTT O GAA CACCAAAAAT GATOOSCATC AATGTATTTT 4560 

ATCTTATTTT GTCAATCTCC TCTCTCTTTC CTCCAOOCAT AATAAGA6AA TOTTOCTACT 4620 

CACACTTCA6 CTGGGTCACA TCCATCCCTC CATTCATCCT TCCATCCATC TTTOCATCCA 4680 

TTAOCTCCAT CCATCCTTCC AAO^TATATT TATTGAGTAC CTACTGTGTO CCAGGGGCTG 4740 

GTGGGACAGT GOTGACATAC TCTCTGCCCT CATAGAGTTG ATTGTCTAGT GAGGAAGACA 4800 

AGCATT7TTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCACAAOT GGTGTTTATT 4860 

GCAATAA008 CTTGGITTOC AACCTCTTTG CTCAACAOAA CATATGnOC AAGAC CCTC C 4920 

CATGGGGGCA CTTGAGTTTT GGCAAOQCTG ACAGA6CTCT G66TTGTGCA CATTTCTTTG 4980 

CATTCCAGCT GTCACTCTGT GCCTTTCTAC AACTGATTGC AACAGACTGT TGAGTTATGA 5040 

TAACACCAGT GGGAATTGCT GGAGGAACCA GAGGCACTTC CACCTTGGCT GGGAAGACTA 5100 

TGOXXKrrGOC TTGCTTCTGT ATTTCCTTGG ATTTTCCTGA AAGTGTTTTT AAATAAAGAA 5160 
CAATTGTTAQ ATGCC 

Seq ID NO: 591 Protein sequence
Protein Accession #: NP_005553.1

1 11 21 31 41 51

| | | | | |

MPALWLGCCL CPSLUiPAAR ATSHREVCDC NGKSRQCIFD RELHRQTGHG FRCLNCKDKT 60 

DGIBCEKC3CK GFTRfiRERDR CLPOJCNSKO SLSARCDKSG RCSCKPGVTG ARCDRCLPGP 120 

HMLTDAGCTQ DQRLLDSKCD CDPAGIAGPC DAGRCVCKPA VT6ERGDRCB S6YYHLD6GN 180 

PBGCTQCPCy GHSASCRSSA EYSVHRITST FBQDVDGWKA VQRNGSPAKb QHSQRBQDVF 240 

SSAC^UiDPVY PVAPAKFUai QQVSyGQSItS FDYRVDSGGR HPSABDVZLE GAGIAITAPL 300 

MPLGKTLPCG LTKTyTPRLM EHPSNNWSPO LSYPBYRBLL RSLTALRIHA TYGBySTGYI 360 

DNVTLISARP VSGAPAPWVE QCZCPVGYKG QPOODCASGY KRDSARLGPP GTCIPQIOQG 420 

6GAa)PDT(3} CYSGDGKPOZ ECADCPIGFY NDPEDPRSCK PCPCBNGPSC SVMFHTEEW 480 



C31HCPP6VT0 ARCBLGAOGY P GP PP GmGP VRPOQPOQOI HSVDPSASQI CDRLT6RCLK 540

CXHtrrAGZYC DQGKAGYFGD PZASNPADKC RACMCHFNGS KW GCJtSDGT CVCKPGFGGP 600 
NCEHGAFSCP ACVKOVKIOM DQFMQQLQRM EALISKAQG6 DGWFDTELB 6RMQQAEQAL 660 

QDILR22AQXS BGASRSLGLQ LAXVRSQENS YQSRLDDLKM TVERVRALG5 QYQNKVHDTH 720 

RLITCJKQXiSL AESEASLGNT NIPASDBYVG FKQFKSLAQB ATRLAESHVB SASNMBQLTR 780 

ETBDYSKQAI. SLVRXAIflBQ VGSQSGSPOO AWQOLVEKIi BKTRSLAQQL TRBATQAEIE 840 

ADRSYQRSLR LIiDSVSRLQG VSDQSFOVEB AKRIKQXADS LSTLVTRRMD EFKRTQKKLG 900 
NWKEEAQQLL QN6KS6RSKS DQLLSRANLA KSRAQEALSM GMATFYEVES ILKNLREFDL 960 

QfVDlTRKAEAE EAMKRLSYIS QKVSDASDKT QQAERALGSA AADAQRAKNG AGEALEISSB 1020 

IBQSZGSLNL EANVTAD6AL AMEKGLASLK SEMRBVEGEL ERKSLSF177N MDAVQNVITE 1080 

AQXVDTIUUQI AGVTXQDTUI TLD6LLRLHD QPLSVDEEGL VLLEQKLSRA KTQINSQLRP 1140 
MMSELEERAR gQRGHiaiiLB TSZDGZLADV KHLSURDm. PPGC»m)AL BQQ 

Seq ID NO: 592 DNA sequence 
Nucleic Acid Accession #: AF101051.1
Coding sequence: 221..856

1 11 21 31 41 51

| | | | | |

GAGCAACCTC AfiCTTCTAGT ATOCAGACTC CAGOGCCDGCC CCGGGO60G6 ACCCCAACCC 60 

OGACOCAQAG CTTCTCCAGC GGCXXSGGCAO G6AGCAGGGC TCCCX38CCTT AACTTOCTCC 120 

GCGGGGCCCa GCCACCTTCG GGAGTC0G06 TTGCCCACCT GCAAACTCTC CGCCTTCTGC 180 

ACCTGCCACC CCTGAGCCAQ CGOGGGCGCC C36AGCGAGTC ATGGCCAAOG CGGGGCTGCA 240 

GCTGTTGGGC TTCATTCTCG CCTTCCTGGG ATGGATOGGC GCCATCGTCA GCACTGCCCT 300 

GCCCCA6TGG AGGATTTACT CCTATGCG6G CGACAACATC GTGACG6CCC AGGCCATGTA 360 

OGAGGGGCTG TGGATGTCCT GC3GTGTC6CA GA6CACX3G06 CAGATCCAGT GCAAAOTCTT 420 

TGACTCCTT6 CTGAATCTGA GCAGCACATT GCAAGCAACC 0GT6CCTTGA TGGTGGTTGG 480 

CATCCTCCTG GGAGTGATAG CAATCTTTGT GGCCACOGTT GGCATOAAGT GTATGAAGTG 540 

CTTGGAAQAC GATGAGQTGC AQAAGATGAG GATGGCTGTC ATTGGGGGTG CGATATTTCT 600 

TCTTGCAOGT CTGGCTATTT TAGTTGCC3VC AGCATGGTAT GGCAATAGAA TCGTTCAAGA 660 

ATTCTATCSAC CCTATGACCC CAGTCAATGC CAGGTACGAA TTTGGTCAGO CTCTCTTCAC 720 

TGGCTGGGCT GCPGCTTCTC TCTGCCTTCT GGGAGGTGCC CTACTTTGCT GTTCCTGTCC 780 

CCXSAAAAACA ACX^XTTTACX: CAACACX3U«5 GCXXTTATCCA AAACCTGCAC CTTCCAGOSG 840 

6AAAGACTAC GTGTGACACA GAGGCAAAAG GAGAAAATCA TGTTGAAACA AACCX^AAAAT 900 

GQACATTGAO ATACTATCAT TAACATTAGG ACCTTAGAAT TTTGGGTATT GTAATCTGAA 960 

GTATGGTATT ACAAAACAAA CAAACAAACA AAAAACCCAT GTGTTAAAAT ACTCAGTGCT 1020 

AAACATGGCT TAATCTTATT TTATCTTCTT TCCTCAATAT AGGAGGGAAG ATTTTACCAT 1080 

TTGTATTACT GCTTCCCATT GAGtAATCAT ACTCAAATGG GGGAAGGGGT GCTCCTTAAA 1140 

TATATATAGA TATGTATATA TACATGTTTT TCTATTAAAA ATAGACAGTA AAATACTATT 1200 

CTCATTATGT TGATACTAGC ATACTTAAAA TATCTCTAAA ATAGGTAAAT GTATTTAATT 1260 

CCATATTGAT GAAGATGTTT ATTGGTATAT TTTCTTTTTC GTCCTTATAT ACATATGTAA 1320 

CAGTCAAATA TCATTTACTC TTCTTCATTA GCTTTGGGTG CCTTTGCCAC AAGACCTAGC 1380 

CTAATTTACC AAGGATGAAT TCTTTCAATT CTTCATGCGT GCOCTTTTCA TATACTTATT 1440 

TTATTTTTTA CCATAATCTT ATAGCACTTO CATOGTTATT AAG0CCT7AT TTGTTTT6TG 1500 

TTTCATTGGT CTCTATCTCC TOAATCTAAC ACATTTCATA GCCTACATTT TAGTTTCTAA 1560 

AGCCAAGAAO AATTTATTAC AAATCA6AAC TTTGGAGGCA AATCTTTCTG CATGACCAAA 1620 

GTGATAAATT CCTGTTGACC TTCCCACACA ATCCCTGTAC TCTGACXTCAT AOCACTCTTG 1680

TTT6CTTTGA AAATATTTGT OCAATTGAGT AGCTGCATGC TGTTGCCCCA 6GTGTTGTAA 1740 

CACAACTTTA TTGATTOAAT TTTTAAGCTA CTTAITCATA OTTTTATATC OOCCTAAACT 1800 

ACCTTTTTGT TCCCCATTCC TTAATTGTAT TQTTTTCCCA AGT6TAATTA TCATGOGTTT 1860 

TATATCTTCC TAATAAQQTG T6GTCTGTTT GTCTQAACAA AGTGCTAGAC TTTCTGGAGT 1920 

6ATAATCTG0 TGACAAATAT TCTCTCTGTA GCTGTAAGCA AGTCACTTAA TCTTTCXACC 1980 

TCTTTTTTCT ATCTGCCAAA TTGAGAT3UVr GATACTTAAC CAOTTAGAAO AGGTAGTGTG 2040 

AATATTAATT AGTTTATATT ACTCTGATTC TTTGAACAT8 AACTATGCCT ATGTAGTGTC 2100 

TTTATTTGCT CAGCTGGCTG AGACACTGAA QAAOTCACTO AACAAAACCT ACACaOGTAC 2160 

CTTCATGTGA TTCACTGCCT TCCTCTCTCT ACCAOTCTAT TTCCACTGAA CAAAACCTAC 2220 

ACACATACCT TCATGTGGTT CAGTGCCTTC CTCTCTCTAC CAGTCTATTT CX3^CTGAACA 2280 

AAACCTAOOC ACATACCTTC AT6TGGCTCA GTGGCTTCCT CTCTCTACCA GTCTATTTCC 2340 

ATTCTTTCAG CTGTGT CT GA CATGTTTGTG CTCTGTTCCA TT T T A ACAAC TGCTCTTACT 2400 

TTTCCAGTCT GTACAGAATG CTATTTCACT TGAGCAAGAT GATGTATG6A AAQGGTGTTG 2460 

GCACTGGT6T CTGGAGACCT OGATTTQAQT CTTGGTGCTA TCAATCACCG TCTGTGTTTG 2520 

AOCAAGGCAT TTGGCTGCTG TAAGCTTATT GCTTCATCTG TAAGCGGTGG TTTGTAATTC 2580 

CTGATCTTCC CACCTCACAG TGATGrTGTG GG6ATCCAGT GAGATA6AAT ACATGTAAGT 2640 

6TGGTTTTGT AATTTGAAAA GTGCTATACT AAGGGAAAGA ATTGAGGAAT TAACIGCATA 2700 

OSTmGt^lX i rrGCTTTTCA AATOTTTGAA AATAAAAAAA TOTTAAGAAA TGGGTTTCTT 2760 

GCCTTAACCA GTCTCTCAAG TGATGAQACA GTQAAGTAAA ATTQAGTGCA CTAAACXSAAT 2820 

AAGATTCT6A GGAAGTCTTA TCTTCTGCAG TGAGTATGGC CCAATGCTTT CTGT66CIAA 2880 

ACAQATGTAA TGGGAAGAAA TAAAA60CTA OBTGVIX^ G T A AATCCAACA6 CAAGOOAGAT 2940 

TTTTGAATCA TAATAACTCA TAAGGTGCTA TCTCTTCAGT GATQCCCTCA GAGCTCTTGC 3000 

TGTTAOCTQG CAGCTGACGC TGCTAGGATA GTTAGTTTG6 AAATGGTACT TCATAATAAA 3060 

CTACACAAGG AAAGTCAGCC ACX3GTGTCTT ATGAGGAATT- GQACCTAATA AATTTTAGTG 3120 

TGCCTTCCAA ACCTGAGAAT ATATGCTTTT GGAA6TTAAA ATTTAAATGG CTTTTGCCAC 3180

ATACATAGAT CTTCATQATO TGTGAGTOT A ATTCCATGTG GATATCAGTT ACCAAACATT 3240 

ACAAAAAAAT TTTATGGCCC AAAATGAOCA AOGAAATTGT TACAATAGAA TTTATCCAAT 3300 

TTT6ATCTTT TTATATTCTT CTACCACAOC TGGAAACAGA GCAATAGACA TTTTGG6GTT 3360 

TTATAATGGG AATTTGTATA AAGCATTACT CTTTTTCAAT AAATTOTTTT TTAATTTAAA 3420 
AAAAG6AAAA AAAAAAAAAA AAA 

Seq ID NO: 593 Protein sequence
Protein Accession #: AAD16433.1

1 11 21 31 41 51

| | | | | |

MANAGIiQLIiG FXXAFLGWIO AIVSTALPQH RXYSYAGOHI VTAQAMyBGL WKSCVSQSTQ 60 

QIQCKVFDSL LNLSSTLQAT RALMWGILL GVIAIFVATV OIKCMKCIiBD DEVQKMRKAV 120 

IGGAIFZiLAG LAILVATAWY GNRIVQEFYD PMTFVZIARyB FGQALFTGHA AASZjCLLGGA 180 
UiCCSCPRKT TSYPTPRPYP KPAP^SGKDY V 
Seq ID NO: 594 DNA sequence

Nucleic Acid Accession #: NM_006180.1

Coding sequence: 352..2820

1 11 21 31 41 51

| | | | | |

CCCCCATTCG CATCTAACAA GGAATCTGOG CCCCAQAGAG TCCC3GGACGC CGCOGGTOGG 60 

TGCCCXXSOGC GCOGGGCCAT GCAGCGACGG COGCOGCX5GA GCTCCGAGCA GOGGTAGC3GC 120 

CCCCCIGTAA A6CGGTT06C TATGCGGGGft CCACTGT6AA CCCTGCCGCC TGCCGGAACA 180 

CTCT7C6CTC 0GGAOCA6CT CAGCCTCTGA TAAGCTGGAC T086CACG0C GGCAACAAGC 240 

AC06AGGAGT TAAGAGA6CC 6CAAGCGCA6 GGAAGGCCTC CCCGCAOGGG TG6GG6AAAG 300 

0GGCCX50TGC AGC3GC3GGGGA CAGGCACTCG GGCTGGCACT GGCTGCTAGG GATGTOGTCC 360 

TGGATAAGGT GGCATGGACC OGCCATGGOS CGGCTCTGGQ GCTTCTGCTG GCTGGTTGTG 420 

GGCTTCTGGA GGGCCGCTTT C3GCCTGTCCC A0GTCCT6CA AATGC3W5T6C CTCTOGGATC 480 

TGGTGCAGCG ACCCTTCTCC TG6CAT0GT6 GCATTTCOGA GATTOGAGCC TAACAGTGTA 540 

GATCCTGAGA ACATCACCGA AATTTTCATC 6CAAACXAGA AAAOGTTASA AATCATCAAC 600

GAAGATGAT6 TTGAAQCTTA TGTGGGACTG AGAAATCTGA CAA1TGTGGA TTCTGGATTA 660 

AAATTT6TG0 CTCATAAAGC ATTTCTGAAA AACAGCAACC TGCAGCACAT CAATTTTACC 720 

CGAAACAAAC TGACGAGTTT GTCTAGGAAA CATTTCCGTC ACCTTGACTT GTCTGAACTG 780 

ATCCTGGTGG GCAATCCATT TACATGCTCC TGTGACATTA TGTGGATCAA GACTCTCCAA 840 

GA6GCTAAAT CCAGTCCAGA CACTCAGGAT TTGTACTGCC TGAATGAAAG CAGCAA6AAT 900 

ATTO0CCT6G CAAACCTGCA GATACCCAAT TGTGGTTTGC CATCTGCAAA TCTGGCOSCA 960 

CCTAACCTCA CTGTGGAG6A AGGAAAGTCT ATCACATTAT CCTC5TAGTGT GGCAGGTGAT 1020 

CCGGrrCCTA ATATGTATTG GGATGTTGGT AACXTPGGTTT CCAAACATAT GAATGAAACA 1080 

AGCCACACAC AGGGCTCCTT AAGGATAACT AACATTTCAT CCGATGACAG TGOGAAGCAG 1140 

ATCTCTTGTG TGGOGGAAAA TCTTGTAGGA GAAGATCAAG ATTCPGTCAA CCTCACTGTG 1200 

CATTTT6CAC CAACTATCAC AOTTCTOGAA TCTCCAACCT CAGACCACCA CT6GTGCATT 1260 

CCATTCACTO TGAAAOGCAA CCCCAAACCA GOGCTTCAGT G6TTCTATAA C6GGGCAATA 1320 

TTGAATGACT CCAAATACAT CTGTACTAAA ATAGATGTTA CCAATCACAC GGAGTACCAC 1380 

GGCTGCCTCC AGCTGGATAA TCCCACTCAC ATGAACAATQ GGGACTACAC TCTAATAGCX: 1440 

AA6AATGAGT ATG6GAAGGA TGAGAAACAG ATTTCTGCTC ACTTCATGGG CTGGCCTGGA 1500 

ATTGACGA3G GTGCAAACCC AAATTATCCT GATGTAAT7T ATGAAOATTA TGGAACT6CA 1560 

GC6AATGACA TCGOGGACAC CAOOAACASA AGTAATGAAA TCOCTTCCAC AGACGTCACT 1620 

GATAAAACOG GTOSGGAACA TCTCTCX5GTC TATGCTQTGG TGGTGATTGC GTCTGTOGTG 1680 

GGATTTTGCC TTTTGGTAAT GCTGTTTCTG CTTAAGTTGO CAAGACACTC CAAGTTTGGC 1740 

AT6AAAGGCC CAGCCTCCX37 TATCAGCAAT GATGATGACT CTGCCAGCCC ACTGCATCAC 1800

ATCTCCAATG GGAGTAACAC TCCATCTTCT TOGGAAOGTG GCCCAGAT6C TGTCATTATT 1860 

GGAATGACCA A6ATCCCTGT CATTGAAAAT CCCCA6TACT TTGGCATCAC CAACAOTCAG 1920 

CTCAAGCCAG ACACATTTGT TCAGCACATC AAGCGACATA ACATTGTTCT GAAAAGGGAG 1980 

CTAGGOSAAG GAGCCTTTGG AAAAGTGTTC CTAGCT6AAT GCTATAACCT CTGTCCTGAG 2040 

CAGGACAAGA TCTTGGTGGC AGTGAAGACC CTGAAGGATG CCAGTGACAA TGCAOGCAAG 2100 

6ACTTCCACC 6TGAGGCCGA GCTCCTGACC AACCTCCASC ATGAGCACAT OGTOUkGTTC 2160 

TATGGOGTCr GGGTGGAGGO CGACCCOCTC ATCATG6TCT TTGAGTACA7 6AAGCATG60 2220 

GACCTCAAC31 AGTTCCTCAG GGCACAOGGC CCTGATGCOG TGCTGATGGC TGAGGGCAAC 2280 

OOOCCCACGG AACTGAOGCA GTOGCAGATC CTGCATATAG OCXaGCAGAT OGC0GCX3GGC 2340 

ATX3GTCTACC TGGCX3TCCCA 6CACTT05TG CACCGCGATT TGQCCACCA6 GAACTGCCTG 2400 

OTOGGGGAGA ACTTGCTGGT GAAAATGGQG OACTTTOGGA TGTC006GGA 061GTACAGC 2460 

ACTGACTACT AC3U3GGT0 3Q TGGCCACACA ATGCT6CCCA TT06CTGSAT GCCTCCAGAG 2520 

AGCATCATGT ACAGGAAATT CAOGACGGAA AGOGAOGTCT GGAGCCTGGG GGTOGTGTTG 2580 

TGGGAGATTT TCACCTTATOG CAAACAGCCC TGGTACCAGC TGTCAAACAA TGAGGTGATA 2640 

GAGTGTA7CA CTCA6GGC06 AGTCCTGCAG 0GACCXX:6CA GGTGCCCCCA GGASQTGTAT 2700 

GAGCTGATGC TOGGGIGCTQ GCAGOGAOAO CCCCACATQA QGAAGAACAT CAAGQ6CATC 2760 

CATACCCTOC TTCAGAACTT GOCCAAOOCA TCTCeGQTCT A0CT66ACAT TCTAGGCTAG 2820 

GGCCCTTTTC CCCy«3ACCGA TCCTTCXX3UV CGTACTCCTC AGAOGGGCTG AGAGGATGAA 2880 

CATCTTTTAA CTGCCGCTGG AGGCCACCAA GCTGCTCTCC TTCACTCTGA CAGTATTAAC 2940 

ATCAAA6ACT COGAGAAGCT CTCGAGGGAA GCAGTGTGTA CTTCTTCATC CATAGACACA 3000 

QTATT6ACTT CTTTTTGGCA TTATCTCTTT CTCTCTTTOC ATCTCCCTTO GTTOTTCCTT 3060 

rii - emi ' iT taaattttct ■ nTrcri ' crr ttttttcgtc ttccctgctt cacgattctt 3120 

ACCCTTTCTT TTGAATCAAT CTGGCTTCTG CATTACTATT AACTCTGCAT AGACAAAGGC 3180 

CTTAACAAAC GTAATTTGTT ATATCAGCAG ACACTCCAGT TTGCCCACCA CAACTAACAA 3240 

TGCCTTGTTO TATTCCTGCC TTTGATGTGG ATGAAAAAAA GGGAAAACAA ATATTTCACT 3300 

TAAACTTTGT CACTTCT6CT GTACAGATAT G6A6AGTTTC TATGGA7TCA CTTCTATTTA 3360 

TTTATTATTA TTACTGTTCT TATTGTTTTT GGATG6CTTA AGCCTGT6TA TAAAAAA6AA 3420 

AACTTGTGTT CAATCTGTGA AGCCTTTATC TATGGGAGAT TAAAACCAGA GAGAAAGAAO 3480 

ATTTATTATG AACOGCAATA TGQQAGGAAC AAAGACAACC ACTGGGATCA GCTGGTGTCA 3540 

GTCCCTACTT AOGAAATACT CAGCAACTGT TAGCTGGGAA GAATGTATTC GGCACCTTCC 3600 

CCTGAGGAOC TTTCTGA6GA GTAAAAAGAC TACT G GOCTC TGTGCCAT80 ATGATTCTTT 3660 
TCCCATCACC AGAAATGATA GCGTGCAGTA GAQAQCAAAa ATGGCTT 

Seq ID NO: 595 Protein sequence 
Protein Accession #: NP_006171.1

1 11 21 31 41 51

| | | | | |

MSSWIRHHQP AMARLWGFCW LWGPWRAAF ACPTSOCCSA SRIWCSDPSP GIVAPPRLBP 60 

HSVDPEanTB IFIANQKRIiE IZNEDDVEAY VGLRHLTIVD SGLKFVAHKA FLKKS2IXiQHI 120 

NFTRKKLTSL SHKHFRHU3L SELILVGNPF TCSCDIHirXX TXfiBAKSSFD TQDLYCU7BS 180 

SKtnPLANZiQ IPN06LPSAN LAAFHLTVEE GKSZTIiSCSV AGDPVFNMyW DVGSJLVSKHM 240 

NETSHTQGSL RlTSaSSDDS GKQISCVAEN LV6ED0DSVN IjTVHFAPTIT FLBSPTSDHH 300 

WCIPFTVKGN PKPALQWFYN GAILNESKYI CTKIHVTNHT BYHGOiQLDN PTHMNNGDYT 360 

LIAXHEYGKD EKQISAHFKG WPGIDDGAKP NVPDVIYEDY GTAANDIGDT TKRSKEIPST 420 

DVTDKTGREH LSWAVWIA SWGPCLLVM LPLLKLARHS BCPOflHGPASV ZSHDDDSASP 480 

LHHZSMGSNT P6SSEGGPDA VZIGMTKZPV lENPQYFGIT NSQIiKPDTFV QHZKRHHXVL 540 

KRELGBGAFG KVFLAECYNL CPEQDKXLVA VKTLKDASDN ARKDFHREAB LLTNLQHEHZ 600 

VKFYGVCV1X3 DPLIMVFEm KEGDLNKFLR AHGFDAVLHA BQ2PPTELTQ SQMI£IAQQI 660 

AAiatVYLASQ HFVBRDLATR NCLVGOOiLV KIGDFGMSHD VySTDYYHVG GBTNLPZRNM 720 



PPESIMXRXP TTSSDVHSU3 WLNBXmO KQPHyQLSMN EV3ECITQG3t VLQRPRTCPQ 780
BVYELMLGCM QRBPHKRRNl KGIHTLLQin) AKASFVYLDZ LG 

Seq ID NO: 596 DNA sequence
Nucleic Acid Accession #: AF410899
Coding sequence: 483..2999

1 11 21 31 41 51

| | | | | |

GGGAQCACGA GCCT06CTG6 CTGCTTGGCT 06OGCTCTAC 606CTCAGTG CCX3GGC6GTA 60 

GCAGGAOCCT GGAOOCAGGC G OOS G CBGOG G6GCSTGA66C GCOGGAGCCC GGCCTGOAGO 120 

TGCATACCG6 ACCCCCATTC GCATCTAACA A66AATCTGC GCCCC3USAGA GTCC066A08 180

COGCOGGTOG GTGCCOGGCG 0GCO6GGCCA TGCAGCX3A0G GCCX3CCG0GG A6CTC06A6C 240 

AGOGGTAGOG CCCCCCTGTA AAG0GGTT06 CTATGCOSGG ACCACTGTGA ACCCTGC06C 300 

CTGCXGGAAC ACTCTTOOCT COQQACCAGC TCACOCTCTG ATAAGCTGGA CTCGGCACX3C 360 

COGCAACAAG CACC3QAGGA6 TTAAGA6AGC (3XAAG0GCA GGGAAGGCCT CCCO6CA00G 420 

GTGGGGGAAA GOQGCCGGTG CAGCGCGGGG ACA06CACTC GGGCTG6CAC TG6CTGCTAG 480 

GGATGTCGTC CTGGATAAGG TGGCATGGAC CCGCCATGGC GOGGCTCTGG GGCTTCTGCT 540 

GGCTGGTTGT GGGCTTCTGG AGGGCOGCTT TCGCCTGTCC CAOGTCCTGC AAATGCAGTG 600 

CCTCTCGGAT CTGGTGCAGC GACCCTTCTC CTG6CAT0GT 6GCATTTCCG AGATTGGAGC 660 

CTAACAGTGT AGATCCTGAG AACATCACOG AAATTTTCAT 06CAAAOCA6 AAAAG6TTAG 720 

AAATCATCAA 0GAAGATGA7 GTTGAAGCTT ATGTGGGACT GAGAAATCTG ACAATTGTGG 780 

ATTCTGGATT AAAATTTGTG GCTCATAAAG CATTTCTGAA AAACAGCAAC CTGCAGCACA 840 

TCAATTTTAC COGAAACAAA CTGAaSAGTT TGTCTAGGAA ACATTTCCGT CACCTTGACT 900 

TGTCTGAACT GATCCTGGTG GGCAATCXAT- TTACATGCTC CTGTGACATT ATGTGGATCA 960 

AGACTCTCCA AGASGCTAAA TCCAGTCCAG ACACTCAGGA rPPGTACTGC CTGAATGAAA 1020 

GCA6CAAGAA TATTCCCCTG GCAAACCTGC ASATACCCAA TTGTGGTTTG CCATCTGCRA 1080 

ATCTQGCCGC AOCTAACCTC ACTGTGGAQG AAOGAAAGTC TATCACATTA TCCTGTAGTG 1140 

TG6CM3GTGA T006GTTCCT AATATGTATT GGGATGTTGG TAACCTGGT7 TCCAAACATA 1200 

TGAATGAAAC AAGCCACACA CAGGGCTCCT TAAGGATAAC TAACATTTCA TCOSATGACA 1260 

GTGGGAAGCA GATCTCrTGT GTCGCGGAAA ATCTTGTAGG AGAAGATCAA GATTCTGTCA 1320 

ACCTCACTGT 6CATTTTGCA CX3VACTATCA CATTTCTCGA ATCTCCAACC TCAGACCACC 1380 

ACT G GTGCAT TOCATTCACT GTGAAAGGCA AOCCCAAACC AGOGCTTCAG T6GTTCTATA 1440 

AGGGGGCAAT ATTGAAT6AG TCCAAATACA TCTOTACTAA AATACATGTT ACCAATCACA 1500 

C3GGAGTACCA 060CTGCCTC CAGCTG6ATA ATCCCACTCA CATGAACAAT GGGGACTACA 1560 

CTCTAATAGC CAAGAATGAG TATGGGAAGG ATGAGAAACA GATTTCTGCT CACTTCATGG 1620 

GCTGGCCTGG AATTGAOGAT G6TGCAAACC CAAATTATCC TGATGTAATT TATGAAGATT 1680 

ATGGAACTGC AGGGAAT6AC AT0GG6GAGA CX3U36AACAG AAGTAATGAA ATCCCTTCCA 1740 

CAQACGTCAC TGATAAAACC GGTCGG6AAC ATCTCT06GT CTATGCTGT6 GTGGTSATTG 1800 

OGTCTGTGGT 6GGATTTTGC CTTTTGGTAA TGCTGTTTCT GCTTAAGTTG GCAAGACACT 1860 

CCAAGTTTGG CATGAAAGAT TTCTCATGGT TTGGATTTG6 GAAAGTAAAA TCAAGACAAG 1920 

GTGTTGGCCX: AGCCTCOGTT ATCAGCAAT6 ATGATGACTC TGCCAQCCCA CTCCATCACA 1980 

TCTCCAATGG QAGTAACACT CCATCTTCTT OQOAAGGTGG CCCAGA1GCT G7CATTATTG 2040 

GAATGACCAA GATCCXTTCTC ATTGAAAATC CCCAGTACTT T6GCATCACC AACAGTCftGC 2100 

TCAAOCCAGA CACATTTGTT CAGCACATCA AGOGACATAA CATTGTTCTG AAAAGGGAGC 2160 

TAGGCGAAGG AGCCTTTGGA AAAGTGTTCC TA6CTGAATG CTATAACCTC TGTCCTGAGC 2220 

AGGACAAGAT CTTGGTGGCA GTGAAGACCC TGAAGGATGC CAGTQACAAT GCACGCAAGG 2280 

ACTTCCACOQ TGA06CX:GAG CTC CTGAC CA AOCTCCAGCA TGAGCACATC GTCAAGTTCT 2340 

ATGGCGTCTO GGTGGAGGGC GACCCCCTCA TCATGGTCTT T6AGTACAT6 AASGATGGGG 2400 

ACCTCAACAA GTTCCTCAGG GCACACGGCC CTGATGCCXyT GCTGATGGCT GAGGGCAACC 2460 

CGCCCACGGA ACTGACGCAG TCGCAGATGC TGCATATAGC CXAGCAGATC GCCGCGGGCA 2520 

TGGTCTACCT GGOGTCCCAG CACTTCGT6C ACOGOGATTT G0GCACCAG6 AACTGCCTG6 2580 

TOGGGGAGAA CTTGCTGGTG AAAATCGGGG ACTTTGGGAT GTCOOGGGAC GTGTACA6CA 2640 

CTGACTACTA CAGGGTOGGT GGGCACACAA T6CT6CXX3VT TCGCTGGATG CCTCCAGAGA 2700 

GCATCATGTA CAGGAAATTC AOGACGGAAA G0GAC6TCTG GA6CCTGGGG GTCX3TGTTGT 2760 

GOGAGATTTT CACCTATGGC AAACAGCCCT GGTACCAGCT GTCAAACAAT GAGGTQATAG 2820 

AGTGTATCAC TGAGGGC06A GTCXH^CAGC GACCCCGCAC 6TGCX2CCCAG 6AGGT6TATG 2880 

AGCT6ATGCT G6GGT6C766 CA6GQABAGC OCCACATGAa 6AAGAACATC AAQOGCATCC 2940 

ATACCCTCCT TCAGAACTTG GCCAA6GCAT CTC066TCTA CCTGQACATT CTAGGCTAGG 3000 

GCCOTTTCC CCAGACXXSAT CCTTCCCAAC GTACTCCTOl GAOSGGCTGA GAGGATGAAC 3060 

ATCTTTTAAC TGCCGCTGGA GGCCACCAAC CTGCTCTCCT TCACTCTGAC AGTATTAACA 3120 

TCAAAGACTC OGAGAAGCTC T0GAGGGAA6 CAGTGTGTAC TTCTTCATCC ATA6ACACAG 3180 

TATTGACTTC TTTTTGGCRT TATCTCTTTC TCTCTTTCCA TCTOOCTT96 TTGTTOCTTT 3240 

TTCTTTTTIT AAATTTTCTT TrTCl ' Kmr TTTTTCOTCT TCCCTOCTTC AOGATTCTTA 3300 

CCCTTTCTTT TQAATCAATC TOGCTTCTOC ATTACTATTA ACTCTGCATA GACAAAGGCC 3360 

TTAACAAACG TAATTTGTTA TATCAGCASA CACrCCAGTT TGCCCACCAC AACTAACRAT 3420 

GCCTTGTTGT ATTCCTGCCT TTGATGTGOA TGAAAAAAAG GGAAAACAAA TArTTCACTT 3480 

AAACTTTGTC ACTTCT6CIG TACAGATATC GAGAGTTTCT ATG6ATTCAC TTCTATrTAT 3540 

TTATTATTAT TACTGTTCTT ATTGT7TTTO 6ATGGCTTAA GCCTGTGTAT AAAAAAGAAA 3600 

ACTTGTGTTC AATCTGTGAA GCCTTTATCT ATGGGAGATT AAAACCA6AG AGAAAGAAGA 3660 

TTTATTATGA ACXX3CAATAT GGGAGGAACA AAOACAACCA CTGGGATCAO CTGGTGTCAG 3720 

TCCCTACTTA 6GAAATACTC AOCAACI6TT A6CT6GGAAG AATGTATTOG GCACCTTCCC 3780 

CTGAG6ACCT TTCTGAGGAG TAAAAAGACT ACTGGCCTCT GTGOCATGGA T6ATTCTTTT 3840 

CCCATCACCA GAAATCATAG OGTG CASTAg AGAGCAAAGA TGgCTT CCQT QRGAC ACAftG 3900 

ATGGOGCATA GTGTGCTGGG ACACA5TTTT GTCTTOGTAG QTTGTOATGA TAGCACTOGT 3960 

TTGTTTCTCA AGGGCTATCC ACAGAACCTT TGTCAACTTC AGTTGAAAAG AGGTOGATTC 4020 
ATGTCCAGM3 CTCATTT06G G6TCAG6T6G GAAAGCX: 

Seq ID NO: 597 Protein sequence
Protein Accession #: AAL67965.1

1 11 21 31 41 51

| | | | | |

M5SWIRHHGP AMARLHGFCW LWGFHRAAF ACPTSCKCSA SRIHCSDPSP GIVAFFRLEP 60 

NSVDPQJITE IFlANQKRIiE IINEDDVEAy VGLRIILTIVD SGLKPVAHKA PLKNSNIiQHI 120 

NPTRNKLTSL SRKHFRHLDL SEblLVQiPP TCSCDIMWIK TliQEAKSSPD TQDLYCLSES 180 

SIOnPLANLQ IPNOGIiPSAN liAAPNLTVEE GKSXTLSCSV AGDPVPKKnf DVGNLVSKHM 240 
KBTSHTQGSL RITKISSDDS GRQISCVAQT LV6EDQDSVN
WCXPFTVK6N PKPWm^ GAlUlSSSai CTKZBVmaBT 
LIAKHEYGBZ) EKQISAHFHG WPGIODGANP trmJVIYEDY 
DVTDKTGRBB LSVYkVWIA SWGFCLIiVM LFLLKLARHS 
VGPASVISND DDSASPLBHI SK6SNTFSSS B6GFDAVII0 
KFOTFVQHIK RBNZVLKREL 6BQAFGKVPL ABCYNLCFEQ 
FEREAEZiLIN LQEBHIVKFY 6VCVBGDPLX KVFByKXHGD 
PTELTQSQML HIAQQIAAGM VYIASQHFVB RDIATBKCLV 
DVYRVOGHTM LPIRWMPPES IMYRKFTTES DVWSLGWLW 
CITQGRVI.QR PRTCFQEVYB LMLGCKQREP RMRKHIRGXB 

Seq ID NO: 598 DNA sequence
Nucleic Acid Accession #: AB052906
Coding sequence: 74..814



LTVHFAPTXT 
BYHGCLQLDH 
GTAAllDIGDT 
KFGMKDFSWF 
KTKZPVZENP 
DKILVAVKTL 
Z^nCFLRAKGP 
GB7LLVKX6D 
BIFXyCSKQPW 
TI*U3HIiAKAS 



AAAACCTTGA 
CTCTGGGTCC 
GCTOCTXaCTG 
CATCAGOGTC 
GGATGAAAAG 
CCT6GG6AAG 
GGTGGTGGAC 
GGAAOCCCTC 
TGGATCTTGG 
AATGTGGACA 
GGTTGTG6CC 
CTTCTTGATG 
CTCAGGCACA 
CATCCTCXXTC 
AAGCTGATAC 
CCAGGTGCCX: 
TGGACCCAAT 
TACCTAACAT 
TTCTGGCTGA 
GTACTTCTTT 
TA6ACTTCAG 
ATAAGAAAAA 
TTTAAATAAA 



11 

I 

GGTGATTCAT 
TTAATGGCAG 
TCCGGCTSGT 
ATCCCTAAGT 
ACTTTTCTTC 
AAACTAAATG 
ATACTTACAG 
ACCCTGCAGG 
CAGTTCAGTT 
AC3GGTTCATC 
ATGTCCTTCC 
G6CATGGACA 
ACCCAACTCA 
TQCTTCATCC 
CAAAAGGCTC 
ACXSAOCTAOG 
AGCTCATTCA 
ATTATOCAAT 
CTAAACAAGA 
GAATGATGAT 
ACCTCTC9GG6 
ATTTATATTA 
GACTTCTATT 



21 

I 

CTTCCAGGCT 
CAGCOGCCGC 
O0CGG6CTG6 
TCA6ACCTG6 
ACTATGACT6 
TCACAACGOC 
AOCAACTGOQ 
CCAGGATGTC 
TCGATGGGCA 
CTGGAGCCAG 
ATTACTTCTC 
GCACCCTGGA 
GOQCCACAGC 
TCCCTGGCAT 
CTGTGAGCAC 
GIGTATGTOC 
CTGCXTTTGAT 
TTTCTCTTC6 
TATATCATTT 
CTCTTTCTTG 
ATTCTTTOOS 
ATGATTGTTT 
TCCCAAAAAA 



31 

I 

CTCCTTCCAT 
TACCAAGATC 
G0GAGC06AC 
ACCAOGGTGG 
TGGCAAOVAO 
CTGGAAAGCA 
TGACATTCAG 
TTGTGAGCAG 
GATCTTCCTC 
AAAGATGAAA 
AATGGGAGAC 
GCCAAGTXSCA 
CACCACCCTC 
CTGAG6AGAG 
GGTCTTGATC 
AGTGGCCTOC 
TCCTTTTGCC 
TGCTACCTGA 
TCTTTCTTCT 
CAAATGATAT 
TOTOCTGAAA 
CCTTTAGTAA 
AAAAAAAAAA 



41 

1 

CAAGTCTCTC 
CTTCTGTGCC 
CCTCACTCTC 
T6T6GG6TTC 
ACAGTCACAC 
CAGAACCCAG 
CTGGAGAATT 
AAAGCTGAAG 
CTCTTTGACT 
GAAAAGTGGG 
TGTATAGGAT 
GGAGCACCAC 
ATCCTTTGCT 
TCCTTTAGAG 
AAACTCGCCC 
A6CAGATCAT 
AACAATTTTA 
TGGAATTCCT 

TX3TCA6IAAA 
6AGAATTTTT 
TTTATTGTTC 
AA 



PTBMlQIGDyT 
TNRSI3EXFST 
GFCKVKSRQG 
QYFGXTNSQL 
KDASOKASKD 
DAVLMAEGNP 
FGMSRDVYST 
YQLSMNEVZB 
PVYLDXDG 



51 
I 

CTCCCTAGCG 
TCCCGCTTCT 
TTT6CTATGA 
AAGGGCAOGT 
CTGTCAGTCC 
TACTGAGAGA 
ACACaCCCAA 
GACACAGCAG 
CAGAGAAGAO 
AGAAT6ACAA 
GQCTT6AGGA 
TOGOCATGTC 
GCCTCCTCAT 
TGACAGGTTA 
TTCTGTCTGG 
GATGACATCA 
CCA6CAGTTA 
GCACTTAAAG 
GGAAAATCAA 
ATAATCACQT 
AAATTATTTA 
T6TACTGATA 



Seq ZD HO: 599 Protein sequence 
Protein Accession ff: BAB61048.1 

21 



51 



1 
I 

GGCTCTCACC 
CXXAGTATCT 
GCCCCAAGGA 
ACTGG6TACA 
ACtACTACAG 
ATTACTTCTT 
ACACCTGTGC 
TCTACGAAGT 
AGGQATCTGT 
CCAOCCCTGG 
GACftOACAGA 
CTTCCTTCTT 
AAACAGTA6C 



11 
I 

CTGCTCTCCT 
GAGTACCCTG 
GQAGGATAGG 
GCOTGCCCTT 
ACOTCOQCTG 
06AOGTAGA6 
CTTCCATGAA 
TCCCTQGGAG 
60CAGGCCAT 
ACTGGTQGCC 
G ftAOgCT GOk 
GCTltTAATA 
ATGQCC 



21 
I 

6CA6CTCCA6 
CTGCTCCTQC 
ATAATCCCOa 
CACTT08CCA 
0Q6GTACTAA 
GTGG6CG6CA 
CAGCCAGAAC 
AACAGAAGGT 
TOGCACCAGC 
OCCACCCTGC 
GGAGTCCTTT 
GOCCTGGTAC 



31 
I 

CTTTGTGCTC 
TOGCCACCCT 
GTGGCATCTA 
TCAGCGA6TA 
6AGCCAS6CA 
CCATAT6TAC 
T6CAGAAGAA 
CCCTGGTGAA 
CACCACCCAC 
OGGAGGOCTC 
GTTGCTCA6C 
ATGGTACACA 



41 
I 

TGCCTCTGAG 
AGCTGTGGCC 
TAACOCAGAC 
TAACAAG6CC 
ACAGAC06TT 
CAA6TCCCA0 
ACA6TTGTGC 
ATCCAGGTGT 
TCCCACCCGC 
COCATGTGOC 
AGGGCGCTCr 
CCCCCCChOC 



Seq ZD NO I 601 Protein sequence 
Protein Accession i: NP^001889.1 

1 11 21 31 41 SI 

i I I 1 I 1 

MAQYLSTLLIi Z.LATLAVALA HSPKEBDRIZ PG61VNADLH DEHVQRAZiHP AISEXKKATX 
DDYYRRPIiSV LRARQOTVGG VHYFFDVEVG RTZCTKSQFN LDTCAFBEQP ELQKXQIiCSP 
EZYE7PWENR, RSLVKSRCQE S 

Seq ZD NO: 602 D2ZA sequence 

Nucleic Acid Accession St NN_003976.2 

Coding sequence: 299.961 



300 
360 
420 
480 
540 
600 
660 
720 
780 



PCTAJS02/12476 



60 
120 
160 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 



1 11 21 31 41 

1 I I i 11 

MAAAAATKZL IiCLPTAUiLS 6WSRAGRADP HSIiCyDZTVZ PKFRPGPRWC AVQGOVDEKT 
FIflYDCGNKT VTPV8PLGKK INVTTAWKAQ HPVLRBWDZ LTEQLRDZQL EHYTPREPLT 
LQARMSCBQK AEQB8SGSHQ PSnX3QZPU« FDSERRMHTT VHPGAZUCMKE XNBIDKWAM 
SFHYPSMGDC ZGHItEDFLMG NDSTXiEPSAa APLAMSSGTT QIiRATATTLZ LOCLLZZLPC 
PIZiPGl 

Seq ZD NO: 600 DKA sequence 

Nucleic Acid Accession It NM_00189e.l 

Coding sequence: 57.. 482 



60 
120 
180 
240 



51 
I 

GAGACCATG6 
CTG6CCTGGA 
CTCAATGATG 
ACCAAAGA1G 
GGQGQG8TGA 
CCCAACTTGG 
TCTTTOGAGA 
CAAGAATCCT 
TGTAGT6CTC 
TGC6CCAAGA 
GCOCTCCCTC 
TCCTGCAATT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 



60 
120 



11 
I 



21 



31 
1 



41 
I 



51 
I 



419 



wo 02/086443 

CTCTGAGCTT CTCTGM3CCT T Q TTT U CTCR TCTG6AAAAA GGGGftrTAAA GCATTTACCT 60 

CA7GGAG2T8 TGAAAOAATA GC7QCAAAGC ACCTAACACA TAGTAAGGTT CCCAGT6CAG 120 

CTACTTCTGC TGGGTTGAGT CTAGCTGTGT AfiGCCCCTTG TTCCTCACCT GGAGAAACTG 180 

GGGTGGCAC3G COGGTCCCCC ACAAAAGATA ACTCATCTCT TAATTTGCAA GCTGCCTCAA 240 

CAGGA6GGTG OGOGAACAGC TCAACAATGG CTGHTGGGCS CTOCIxa S TG T TGAXAGAGAT 300 

GGAACTTGGA CTT66A66GC TCTCCAOGCT 6TC0CACTGC CCCTGGOCTA GG0G6CA6CC 360

TGCCCTGT GG CCCACCCTGG CGGCTCTGGC TCTGCTGAGC AGOGTO S CAG AGGCCTCCCT 420 

GSGCTOOSOS CCCCGCAGCC CTGCCCCCCG OGAAGGCCCC CCGCCTGTCC TGGCGTCCCC 480 

OGCOGGCCAC CTGCCGGGGG GAOGCACGGC CCGCTGGT6C AGTGGAAGAG CCCGGCGGCC 540 

GC03CCGCAG CCTTCTCX5GC CCGOOCCCCC GCOGCCTGCA CCCCCATCTG CTCTTCCC06 600 

OQOGG600GC OOGGCGOGGG CTGSGOGOOC G66CA6CD6C GCTO S GGCRQ 0306600808 660 

GGGCTGC06C CTGGGCTCGC AGCTGGTGCC 66I6CG0606 CI06QCCTG6 6CCAC06CTC 720 

OGACGASCTG GTGCGTTTCC GCTTCTGCAG CGGCTCCTGC CGCOGOGCGC GCTCTCCACA 780 

CX5ACCTCAGC CTGGCCAGCC TACTGGGOGC OGGGGCCCTG OGACOGCCCC OGGGCTCC06 840 

QCCC6TCA6C CAGCCXTTGCT 6CC6ACCCAC GCGCTACGAA G066TCTCCT TCATGGACGT 900 

CAACAQCAOC TOGAGAACOG TGGACC G OCT CTOOGCCACC GOCTGOGGCT GCCT GQ GCTG 960 

AGG6CT06CT 0CAG6GCTTT GCAGACTG6A CCCTTACOGG TGGCTCTTOC TGCCIGGGAC 1020 

CCTCCC3GCAG AGTCXX^lCTA GCCAGCGGCC TCAGCCAGGO AOGAAGGCCT CAAAGCTGAG 1080 

AGGCCCCTAC OGGTGGGTGA TGGATATCAT CCCOSAACAG GTGAAGGGAC AACTGACTAG 1140 

CAGCCCCAGA 6CCCTCACCC TGOGGATCCX: AGCCTAAAAG ACACCAGAGA CCTCAGCTAT 1200 

6GAGCCCTTC 6GACCCACTT CTCACAGACT CT66CACIG6 CCAG6CCTOC3 AACC1G06AC 1260 

CCCTCCTCTG ATGAACACTA CAGTG6CTGA GGCATCAGCC CC06CCCAG6 CCCTGTA6GG 1320 

ACAGCATTTG AAOGACACAT ATT6CAGTTG CTTOGTTGAA AGTGOCTGTG CTGGAACTGQ 1380 
GCT6TACTCA CTCATGGGAO CTG600CC 

Seq ID NO: 603 Protein sequence
Protein Accession #: NP_003967.1

1 11 21 31 41 51

| | | | | |

MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS 60

PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 120

RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSCRRARSP HDLSLASLLG AGALRPPPGS 180
RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATACGCLG

Seq ID NO: 604 DNA sequence 
Nucleic Acid Accession #: NM_057091.1
Coding sequence: 783..1445

1 11 21 31 41 51

| | | | | |

ACTGGCCGCT GAGAGAAGAA T0G6GTGGAG CA6AGAGCAG CTGCTGCAG6 GCAGACAGCC 60 

G6ACCCCCAA ATCT6CA0GT ACCA6CAGTC AGCQOCCCCA 06CASQGACC GGCTTAOCCC 120 

TOGCTCCCGG CCCTCACTCA CTTTCTCCCG CCCTOSGCCC GGCCTC0CA8 CTCTCTACTT 180 

CGOSTGTCTA CAAACTCAAC TCCCGGTTTC CGTGCCTCTC CAC06CT0GA GTTCTCTACT 240 

CTCCATATCC GA6GGGCCCC TCCCAGCATC TACCCCCCTC CCAACCTOGO GGGACCTAGC 300 

CAA6CTAGGG GOGACTGGAT CCGAOGGGTG GAGCAGCCAG GIGAGCCCCG AAAG6T6GGG 360 

0GG6GCAGGG G06CTCCCAG CCCCACCCCQ GGATCTOGTG A06CTGGG6C TGGAATTTGA 420 

CACCGGACG6 CTG0GG06GC GGGCAGGAG6 C7GCTGAGGG ATGGA6TTGG GCCG66CCCC 480

CAGACAAGGC CCGGGGGCTC C3GCCAGCAGC A6GTCCCTCG GGCCXX3U3CC CTCGCTGCCA 540 

CCCGGGCCTG GAGCXXCACA CCCGAGGGTG CAGACTGGCT GCOVAGGCCA CACTTTT66C 600 

TAAAAQAG6C ACTGCCAGGT GTACAGTCCT GG6CATGCGC TGTTTGA6CT TOGGGGGAGA 660 

OCCCftGCACT GGTCCC066A AA66TGCCTA GAAGAACAA6 GTGCAGGACX: COGTGCTGCC 720 

TCAACAGGAG GGTGGGGGAA CAGCTCAACA ATGQCIGATG GGG6CTCCIG GTGTTGATAG 780 

AGATGGAACT TGGACTTG6A GGCCTCTCCA CX3CTGTCCCA CTCCCCCTOG CCTA6G0GGC 840 

AGCCTGCCCT GTGGCCCACC CTGGCCGCTC TGGCTCTGCT GAOCAGCGTC GCAC3AGGCCT 900 

CCCTGGGCTC OQOOCCCOGC AGCCCIGCCC CCC6CGAAGG CCCCCC G CCT GTCCTGGCGT 960 

CCCCOSCOCn CCAGCTGCCG GG6GGA06CA CGGCCOGCTG GTGCAGTGGA AGAGCC06GC 1020 

G600G006CC 6CAGCCTTCT OGGCCOGOSC CCCO6C06CC TGCACCOOCA TCTGCTCTTC 1080 

CCCGCXK^GGG C0GC6CG6CG CGGGCTOGGG GCCCGOGCAG CCGCGCTOGG GCAGC6GGGG 1140 

CXKX3GGGCTG COGCCTGCGC TCGCAGCTCG TGCCGGTG03 COCGCTCGGC CTGGOCCACC 1200 

6CTCOSAOGA GCTOGTGOGT TTCOGCTTCT GCAGCQGCTC CTGOOGOCXSC GCGOGCTCTC 1260 

CACACGACCT CAOCCTGGCC AGCCTACT6G GOQCCaGGGC CCTGC6A00G CCn:OQGGCT 1320 

CCOGGCCCGT CA6CCA0CCC T6CT0C06AC CCACGCSCTA 0GAA606GTC TCCTTCATGO 1380 

ACGTCAACAG CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTtSC GGCTGCCTGG 1440 

GCTGAGGGCT CGCTCCAGGQ CTTTGCAGAC TGGACCCTTA CCX3GTGGCTC TTCCXGCCTQ 1500 

GGACCCTCCC GCAGA6TCCC ACTAGCCAOC GGCCTCA6CC AGGGAOGAAG 6CCTCAAA6C 1560 

TGAGAGGCOC CTA0096TGQ GTGATGGATA TCATCOOOGA ACAGGTSAAO 0QACAAC7GA 1620 

CTAGCAGCCC CAGAGCCCTC ACCCTQOGGA TCCCAGCCTA AAAGACACCA 6AGACCTCAG 1680

CTATGGA6CC CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCA6GC CTCGAACCTG 1740 

GGACCCCTCC TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCOGCC CAGGCCCTGT 1800 

A6GGACAGCA TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTQCTGGAA 1860 
CTGGCXITGTA CTCACTCATG GGAGCTGGCC CC 

Seq ID NO: 605 Protein sequence 
Protein Accession #: NP_003967.1

1 11 21 31 41 51

| | | | | |

MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS 60

PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 120

RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSCRRARSP HDLSLASLLG AGALRPPPGS 180
RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATACGCLG



Seq ID NO: 606 DNA sequence

Nucleic Acid Accession #: NM_057160.1
Coding sequence: 1..714

1 11 21 31 41 51

| | | | | |

ATGOXXSGOC TGATCTCAGC COGAGGACAG CCCCTCCTTG AG6TCCTTCC TCCCCAAGCC 60 

CACCTGGGT6 CCCTCTTTCT COCTGAGGCT CCACTT60TC TCTCOGCOCft 0GCT6CCCT6 120 

TG600CAC0C TGGCOGCTCT QGCTCTGCTQ A6CAG0GT0G CAGAG60CTC CCTGGGCTCC 180 

GCGCCXXGCA GCCXTPGCCCC CCGCX3AAGGC CCCCOSCCTG TCCTGGC3GTC CCCOSCCGGC 240 

CACCTGCOGG GGGGACGCAC GGCCGGCTGG TGCAGTGGAA GAGCCCGGOG GCOGCOGCOS 300 

CAGCCTTCTC GCCCCGCGCC CCX^QCOBCCT GOVOCCCCAT CTGCTCTTCC C060GGGGGC 360 

GGCGOGOOSC GGQCTGGQGQ CCOOOQCAGC GGGGCTOQGO CAG0GGGQ6C GGGGG6CT6C 420 

CaOCtGCGCr CGCAGCTGGT GCGGGTQOGC GCGCTOGGCC TGGGCCAC08 CTCOGAOGAS 480

CTGGTGCGTT TCCGCTTCTG CAGCGGCTCC TGCCGCCGCG CGCGCTCTCC ACACGACCTC 540

AGCCTGGCCA GCCTACTGGG CGCOGGGGCC CTGOGACOSC CCCC3GGGCTC COSGCCCGTC 600 

AGCCAGCCCT GCTGCX^GACC CACGOSCTAC GAAGOGGTCT CCTTCATGGA OGTCAACAGC 660

ACCTG6AGAA CCGTGGACOQ CCTCTOOGCC A006CCTGCX3 GCTGCCT606 CTGAGGGCTC 720 

GCTOC A G GGC TTT6CA6ACT GGACCCTTAC CGGTG6CTCT TCCT6CCT06 6ACCCTCC0G 780 

CAOAGTCCCA CTAGCCAGCX5 GCCTCAGCCA GGGAC3GAAGG CCTCAAAGCT GAGAGGCOX: 840 

TACCGGTGGG TGATGGATAT CATCCCCGAA CAGGTGAAGG GACAACT6AC TAGCAGCCCC 900 

AGA6CCCICA CCCTGCGGAT CCCA6CCTAA AAGACACCAQ AGACCTCAGC TATGGAGCCC 960 

TTOGGftOOCA CTTCTCACAa ACTCTGGCAC TGCCO^OGCC TC6AACCT0G GACCCCTCCT 1020 

CTGATGAACA CTACAGTGGC TGAGGCATCA GGOCOOGCOC AGGCCCTGT A GGGACAGCAT 1080 

TTQAAGGACA CATATTQCAO TTQCTTGGTT GAAAGIGGCT GTGCTGGAAC TG6CCT6TAC 1140 
TCACTCATOa GAGCTGOOCC C 
Seq ID NO: 607 Protein sequence
Protein Accession #: NP_476501.1

1 11 21 31 41 51

| | | | | |

MPGLISARGQ PLLEVLPPQA HLGALFLPEA PLGLSAQPAL WPTLAALALL SSVAEASLGS 60

APRSPAPREG PPPVLASPAG HLPGGRTARW CSGRARRPPP QPSRPAPPPP APPSALPRGG 120

RAARAGGPGS RARAAGARGC RLRSQLVPVR ALGLGHRSDE LVRFRFCSGS CRRARSPHDL 180
SLASLLGAGA LRPPPGSRPV SQPCCRPTRY EAVSFMDVNS TWRTVDRLSA TACGCLG



Seq ID NO: 608 DNA sequence

Nucleic Acid Accession #: NM_057090.1

Coding sequence: 29..715

1 11 21 31 41 51

| | | | | |

CTGATGGGG6 CTCCT GG TGT TGATAGAGAT GGAACTTGGA CTTGGAGGCC TCTCCACGCT 60 

6TC0CACTSC C0CT6G0CTA GG066CAGGC TOCACTTGGT CTCTCCGGSC AGCCTGOCCT 120 

GTGGCCCACC CT6GCG0CTC TGOCTCTOCT GAGCAGCGTC GCAGAGGCCT CCCTGGGCTC 180

OGCGCCCCGC AGCCCTGCCC CCOGOGAAQG CCCCCC6CCT GTCCTGGOGT CCCCCGCCGG 240 

CCACCTGCOO GOQGGAOGCA CGGCCCGCTG GTGCAGTGGA AGAGCCCGGC GGCCGCCGCC 300 

GCAGCCTTCT OGGCCCGCGC CCCCGCOGCC TGCACCCCCA TCTGCTCTTC CCCGCGGQGG 360 

CCGCGOQGCG 06GGCTGGG0 0COC6GGCAG COQOQCTGGG 6CAGQGGGGG CQCGGGGCTG 420 

C0GCCT6CGC TC6CAGCTGG TGOCQGTGCG COGOCTCQGC CTGG6CCACC GCTCCGACGA 480

GCTGGTGCGT TTCCGCTTCT GCAGCGGCTC CTGCCGCCGC GOGOGCTCTC CACACGACCT 540 

CAGOCTGGCC AGCCTACTOQ GOSCOGGGGC CCTGOSACOG CCCCOGGGCT OOOGGCCCGT 600 

CA6CCAGCCC T6CTGCCGAC CCAC60GCTA CGAAGCGGTG TCCTTCATGG ACGTCAACAG 660 

CACCIGGAGA ACCGTSGACC GCCTCTCOGC CAOCGCCTGC G6CT0CCTQ0 GCTOAGSGCT 720 

06CTOCAOGG CTTT6CAGAC TGGACCCTTA COOOrGGCTC TTOCTGCCTG GGAGCCTCCC 780

GCAGA6TCCC ACTAGCCAGC GGCCTCA6CC AGGGAOQAAG GCCTCAAAGC TGAGAG6CCC 840 

CTACOGGTGG GTGATGGATA TCATOCCCGA ACAGGTGAAG GGACAACTGA CTAGCAGCCC 900 

CAOAGCCCTC ACCCTGOGGA TCCCAGCCTA AAAGACACCA GAGACCTCAG CTATGGAGCC 960 

CTTGGGACCC AJCTTCTCACA 6ACTCTGGCA CTGQOCAGGC CTOGAACCTO GGAOCCCTOC 1020 

TCT6ATGAAC ACIACA0TG6 CT6AGGCATC AQCOCCCGCC CASGCOCTOT AOQG ACAGC A 1080 

TTTQAAG6AC ACATATT6CA OTTQCTTQGT TGAAAOTQCC TGTQCTGGAA CTGGCCTOTA 1140 
CTCACTCATO GGAGCTQOCC CC 

Seq ID NO: 609 Protein sequence
Protein Accession #: NP_476431.1

1 11 21 31 41 51

| | | | | |

MELGLGGLST LSHCPWPRRQ APLGLSAQPA LWPTLAALAL LSSVAEASLG SAPRSPAPRE 60

GPPPVLASPA GHLPGGRTAR WCSGRARRPP PQPSRPAPPP PAPPSALPRG GRAARAGGPG 120

SRARAAGARG CRLRSQLVPV RALGLGHRSD ELVRFRFCSG SCRRARSPHD LSLASLLGAG 180
ALRPPPGSRP VSQPCCRPTR YEAVSFMDVN STWRTVDRLS ATACGCLG



Seq ID NO: 610 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..1746

1 11 21 31 41 51

| | | | | |

ATGCCACTGA ACCATTATCT CCTTTTGCTG GTGG6CTGCC AAGCCTGGGG TGCAGGGTTG 60

GCCTACCATG GCTGCCCTAG CGAGTGTACC TGCTCCAGOO CCTCCCAOGT GGAGT6CACC 120

GGGGCACGCA TTGTGGOSGT GCCCACCCCT CTGCCCTGGA AOGCCATGAG CCTGCAGATC 180 

CTCAACACGC ACATCACTGA ACTCAATGAG TOCCOGTT C C TCAATATCTC AGCCCTCATC 240 

GCCCXGA0GA TTGAGAAOAA TGA6CIGT08 CGCATCAC6C CIGOGGCCTT OOGAAAOCTG 300 

GGCTCGCTGC GCTATCTCAG CCTCGCCAAC AACAA6CTGC AGGTTCTGCC CATC66CCTC 360

TTCCAGGGCC TGGACAGCCT TGAGTCTCTC CTTCTCTCCA GTAAOCAGCT GTTGCAGAIC 420

CAGCOGGCCC ACTTCTCCCA GTGCAGCAAC CTCAAGGAGC TGCAGTTGCA CGGCAACCAC 480 

CIGGAATACA TCCCTGAOGG AGCCTTGGAC CACCTGGTAG GACTCAOGAA GCTCAATCT6 540 
GGCAAGAMTA GCCTQlOOCA CATCTCftCOC ABGGTCITOC AGCACCTGGG CAATCTCCSUS 600

GTCCTCOQGC TGTATOAGAA CAGGCIGAOG GATATCCGCA TGGGCACTTT TGATGGGCTT 660 

GTTAACCTGC AGGWVCTGGC TCTACAGCAG AACCAGATTG GACTGCTCTC COCTGGTCTC 720 

TTCC3W3VACA ACCACAACCT CXZAGAGACTC TACCTGTCCA ACAACCACAT CTCCCAGCTG 780 

CCACCCAGCA TCTTCATGCA 6CTGCXXX:AG CTGAACC6TC TTACTCTGTT TG6GAATTCC 840 

CTGAAGGA6C TCTCTCTOGG GATCTTOGQG CCCA7GCCCA A0CTG0G66A GCTTTG6CTC 900 

TATGACRAOC ACATCTCTTC TCTACCCGAC AATGTCTTCA GCAACCTCOG CCAGTTGCAG 960 

GTCCTGATTC TTAGCC33CAA TCAGATCAOC TTCATCTCXr C3GGGTGCCTT CAAC3GGGCTA 1020 

ACGGAGCTTC GGGAGCTGTC CCTCCACACC AAOGCACTGC AGGACCTGGA OGGGAATCTC 1080 

T TOOGCATGT TGGCCAACCT GCAGAACATC TCCCTGCAGA ACAATOGGCT CAGACAGCTC 1140 

0CA606AATA TCTTOGOOUV 0GTCAAT6GC CTCATG6CCA TCCAGCTGCA GAACAACCAG 1200 

CTGGA6AACT TGCCCCTCGG CATCTT06AT CACCIGG6GA AACTGTGTGA GCTGOGGCTG 1260 

TATGACAATC CCTGGAGGTG TGACTCAGAC ATCCTTCOGC TCCXSCAACTG GCTCCTGCTC 1320 

AACCAGCCTA GGTTAOGGAC CGACACTGTA CCTGTGTGTT TCAGOCCAGC CAATGTCOGA 1380 

GGCCAGTCCC TCATTATCAT CAATGTCAAC GTTGCTGTTC CAA60GTCCA TGTCCCTGAG 1440 

GTSCCT A GTT ACCCA6AAAC AOCATGGTAC CCAGACACAC CCASTTAOOC TGAGACCACA 1500 

TOCGTCTCTT CTACCACTGA GCTAACCAGC CCTGTGGAAG ACTACACTGA TCT6ACTACJC 1560 

ATTC3VGOTCA CTGATGACOS CAGOGTTTGG GGCATGACCC AGGCCCAGAQ 0GG6CTGGCC 1620 

ATTGCCGCCA TTGTAATTGG CATTGTCGCC CTGQCCTGCT CCCTGGCTGC CTGCGTCGGC 1680 

TGTTGCTGCT GCAAGAAGAG GAGCCAAGCT GTCCTGATCC AGATGAAGGC ACXTAATGAG 1740 

TGTTAAAGAG GCAGGCTGGA 6CAGGGCT6G GGAATGATGG 6ACTGGAGGA CCTGGGAATT 1800 

TCATCTTTCT GCCTCCACCC CTGGGTCCAT GGAGCTTTCC OGTGATTGCT CTTTCTGGCC 1860 

CTAGATAAAG GTGTGCCTAC CTCTTCCTGA CTTGCCTGAT TCTCOTGTAQ AGAAGCASGT 1920 

OGTGCOGGAC CTTCCTACAA TCAGGAAGAT AGATCCAACT GGCCATGGCA AAAGCCCTGG 1980 

GGATTTCCCA TTCATACCCC TOOGCTTCCT TOGAGAGGGC TCTTCCTCCA AATCCTCXXX: 2040 

ACCTGTCCTC CAAGAACASC CTTCCCTGOS CCCAGGCCCC CTCC3GGGCCT CTGTAGACTC 2100 

AGTTAGTCCA CAGCCTGCTC ACTTOGTGGG AATAGTTCTC CGCTGAGATA GCCCCTCTCQ 2160 

CCTAAGTATT ATGTAAGTTC ATTTOCCTTC TTTTGTTTCT cr X ' G ' m 'GTG CTATGGCTTG 2220 

ACCCAGCATO TCCCCTCAAA TGAAAOTTCT CCXXTTTGATT TTCTGCTCCT GAAGGCAGGG 2280 

TGAGTTCTCT CCTCAAAGAA GACTTCAAAC CATTTAACTG GTTTCTTAAG AGCCXTTCAAT 2340 

CAGCCTGGTT TTGGGGATGC TATGAAAGAG AGAAGGAAAA TCATGCOGCT CAGTTCCTGO 2400 

AGACA6AAGA GCOSTCATCA GTGTCTCACT TGrGATTTTT ATCT6GAAAA GGAAGAAACA 2460 

COCCAGCACA 6CMGCTCA6 CCTTTTAGAG AA66ATATTT CCAAACTOCA AACTTTOCTT 2520 

TQAAAAGTTT AGCCCTTTAA G6AATGAAAT CATGTAGAAT 7TTQGACTTC TAAAAACATT 2580 

AAAATCAGCT TATTAATACQ GGATAGAGAA AGAAATCTGG TGCXTTGGGGG TCCCTGTGTT 2640 

CACXXCTAGA GXTTOTTTTA AAATTTTTAA TTGAAGCATG TGAAGTGTAC STGCAGAAAA 2700 

6TGGGAACAT GATAGTGTAT G6CTTGGTGG ATTTTCACAA ACTGAACATA CCTGTGTAAT 2760 

CAGCATCTAG ACCCAGAOCC AGAGCATCAC AAATATCCCC CATGCTGOQC TTTTCCCAGA 2820 

Q6AGATGGGG GCTTCTQAAG ATG6ACTTAC CTSGGAOCIG CO00CCAT6A GCOVGOAOGQ 2880 

TCCCCCCACA GTCAGOCrCT GCAAAGGCCC CGTGGCCAGO GGTQ6AG6AG AATATGTGGG 2940 

TGTOGAGAGG ATGSGAGACT GTGGCCTGAA CAGGAGATTT TATTATATCT GGAGACCCTO 3000 

A6AGACXCTG AGACCTGGGC CACCATGGCT GGCCAGGTCA GAAGCATCCT GACTGCAGAQ 3060 

OTCOOTQCAG CCACACOCTC TTCCCTGCCA GCAA6TTGTC TGCGOCTCAT OQGAGGOCCC 3120 

TOOCOCTGGA GCCTTCTAT6 GAOGTGATAT GCCT6TATCT GTTTTTAATT TTCATTCTTC 3180 

ACTTAGGGGA AGTGAAATCG CTCAGAGATG AGATCCTTTA ATTGAAAAOQ AAGTGTAAOQ 3240 

GAATCTAGTG TCTTTCTAAT GTGGTAAAAT TCTCCATCAA CATCACAGTC AGCTGGCAGC 3300 

TGAACTTCAG AATCTCACTT ACAGCAGGCG ACACGGGGG7 ACACO SATGG GTC ACACTGQ 3360 

GTCTGGGQGC TC^CTGGAGC TOCTOCTGOO TCrrOGTCTGO TTAGGAGTTQ AGTTGTTTGC 3420 

TOCASGGTTA TTCTCCTCCT OSAGTCACAG TCACAC3GAAT ACCTGCTTTC TCTGGCTTTC 3480

CTGCTATACA CATATTCACA TGGOGCTCAA GAAGTTAGGC TCAIGGCAAC GrGTGTCTTT 3540 

CTCTGGACAA CTGGCCCAGT TTACAGTGAA ATGGAGAATT TCAGGTCTCC ACGTCTGCCC 3600 

AGGAAAGAAC TTCAGCTQAC TCCAOGGGGA TCTGGAAATC CAOGACCAAT CCGGATC6GC 3660 

TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGCTTTGO AAATCCAOCA CCAATCCOSA 3720 

TOGGCTCTTA TTAGCTCCCC GCTCCACAAG ACACCTGZGA TCT06AAATC 7ACCACCAAT 3780 

CCCGATOGGC TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGACATOC TCCAGGGCCA 3840 

CACS5AGCACG TGCTGACCAG rTTTCCCTTC CAGTTCCTSC ACAAAAAGTO TCCAGAGGGC 3900 

TGTTTOCAAA CACTAGTGCA CTTTGTAOCT TTTCACCCTC TGTCCCAGGG AATCTAGGAO 3960 

AGATGAOGCC OQTCAGAQTC AAGAOATOIC ATCCCCCCAO 6GTCT0CAA0 GCATTTCCAC 4020 

ACTATTGGTG GCACCTGQAO GACATGCACC AAGGCTTQCC AGAOCCAACA GGAA6TGAGC 4080 

CCA6AGCATG GCACATGAGC ATCACCOGCT GATGGTGGCC TGCTGTGCCT GGTGCCAACA 4140 

GGGGCATCOC GGCCCGTACC CCTCCAGACA GGAA6CATG6 GTTTGCCCAC AGACCTGTCQ 4200 

GGTGCTCCTG T6AGTGGCCT CCAGATGTCT TTGT6CATAG GCACAA6TG0 GCCAGGOCTG 4260 

GAOGGAaSTS OGAAAOCTCA TCATCOSGIG GGCCCTGOCA AICTXAACCC AGAACOCTTA 4320 

GGTATTCCTO GCAGTAGCCA T6ACATTGGA GGACCTTCCT CTCCAGCCAG AGGCTGAOCT 4380 

GAGGGCCACT 6TCCTCAGAT 6ACACCACCC AGGAGCACCC TAQGT6AGGG GTGAGGGCCC 4440 

CCTTATGTGA ACCTCTTGCC TCTTOCTTTC TCCCATCAGA GTGGTTOGAT G6AGCCATTG 4500 

GCCTCCTTTT CTTC3U3CGGG CCCTTCAACC TCTCTGCACC ATGTTGTCTG GCPGAGGAGC 4560 

TACTAGAAAA GCTGAGTGGA QTCTCCTTTC CAACAGGATG ATQCATTTGC TCAATTCTCA 4620 

GGGCTGGAAT GA6CCG6CTG GTCCCCCAGA AAGCTGGAGT GGGGTACAGA GTTCAGTTTT 4680 

CCTCTCTGTT TACAGCTCCT TGACAGTCCC ACGCCCATCT GGAGTGGGAG CTGGQAGTTA 4740 

GT6TTGGAGA AGAAACAACA AAAGCCAATT AGAACCACTA TTTTTAAAAA GTGCTTACTQ 4800

TGCACftQATA CTCTTCAAGC ACI GGACSXG GATT CTCTCT CTAG COCTC A GCACCCCTGC 4860 

QGTAGQAGTQ 006CCTCTAC OCACTTGTGA TGGGGZACAG AGGCACTTQC TCTTCTGCAT 4920 

GSIGTTCAAT AGGCIGGGA6 TTTTATTTAT CTCTTCAAAC TTT6TACAA6 A6CTCATGGC 4980

TTGTCTTGGG CTTTOGTCAT TAAACCAAAG GAAATGGAAG CCAnXTCCT GTTGCTCTOC 5040 

TTAGTCTTGG TCATCRGAAC CTCACTTGGT ACCATATAGA TCAAAAGCTT TGTAACCACA 5100

G6AAAAAATA AACTCTTCCA TOOCTTAAAG AAZAGAATAG TTTGTCOCTC TCATGGGAAT 5160 

TGGGCTQTAT GTATATTGTT C T f O C TO C n AGAATTtAQA GATACAAGAG TTCXACTTA6 5220 

AACTTTTCAT GGACACAATT TCCACAACCT TTCAGATGCT GATGTAGAGC TATTGGGAAA 5280 

GAACTTCCAA ACTCAGGAAG TTTGCAGAGA GCAGACAGCT AGAGATAACT GGGGACOCAG 5340 

AGTTGGTCGA CAGATGTTAG ATGTATCCTA GCTTTTAGCC ATAAACCACT CAAAGATTCA 5400 

6CCCCCAGAT CCCACAGTCA GAACTGAATC TGCGTTQTTG GQAAGCCAGC AGTGGCCTTG 5460 

6GAAGGAAGC CATGGCTGTG GTTCAQAGAG GGTGGGCTGG CAAGCCACTT COGGGGAAAA 5520 

CTOCTTCCGC CCCAGGTTTC TTCTTCTCTT AAGGAGAGAT TGTTCTCACC AACCOGCTGC 5580 

CTTCATGCTG CCTTCAAAGC TAGATCATGT TTGCCTTGCT TAGASUITTA CTGCAAATCA 5640 

GCOXIACTGC TTGGOGATGC ATTTACAGAT TTCTAGGCCC TCAGGGTTTT GTAGAGTGTG 5700 

AGC0CTG6TG GGCAGGGTTG GGGGGTCT6T C Ti ' Ci ' GCmu ATQCTSCTTG TAATCCATTT 5760 
G6TGTACA6A ATCAACAATA AATAATATAC ATGTAT

Seq ID NO: 611 Protein sequence
Protein Accession #: BAB54587.1

1 11 21 31 41 51

| | | | | |

HPLKUYLLLL V60QAH6AGL AYHGCPSECT CSRASQVECT GARIVAVPTP LPWHAKSLQI 60 

UTTHITELKB SFPUflSALI ALRIEKNELS RITPGAFRNL GSLRYLSLAN NKLQVLPIQL 120 

FQGLDSLBSL LLSSSQhUQl QPAHFSQCSN LKELQIiKGHH IiEYIPDGAFD HLVGLTKLNIt 180

GXN5LTBZSP RVFQBL(3ILQ VLRLYEtlRLT DZPMGTFDGL VNLQEIALQQ MQIGLLSPGIi 240 

FHNNHNLQRL YLSNMHISQIi PPSIFMQIiFQ UTRLTLFGNS IiKBLSUSZFG PKFNLRELWli 300 

VDNHISSLPD NVFSNLRQLQ VLILSRKOIS PISP6AFHGL TELRELSLHT KAIiQDUXafV 360 

FRMU^MLQNI SLQKNRLRQL PQflFAKVNG LMAIQIiQNKQ LSTLPLGIFD HLGKLCEIiRL 420 

YDMPHRCDSD ZLPLRNWLUi NQPRLGTDTV PVCFSPANVR GQSIiZIINVN VAVPSVHVPB 480

VPSyPETPWy PDTPSYPDTT SVSSTTELTS PVEDYTDLTT IQVTDDRSVM GMTQAQSGLA 540 
ZAAIVX6IVA IiACSXAACVG CCCCKKRSQA VLMQMXAPNB C 

Seq ID NO: 612 DNA sequence
Nucleic Acid Accession #: XM_098151
Coding sequence: 1..447

1 11 21 31 41 51

| | | | | |

ATGATOCATT TGCTCAATTC TCAGGGCTGG AATGAGOCGG CTGGTCCCCC AGAAAGCTGG 60 

AGTGGG6TAC AGAGTTCAGT TTTCCTCTCT GTTTACAGCT CCTTGACAGT CCCAOGCCCA 120 

TC7GGAGTG6 GAGCTGGGAG TCAGTGTTGG AGAAGAAACA ACAAAAGCCA ATTAGAACCA 180

CTATTTTTAA AAAGTGCTTA CTGTGCACAG ATACTCTTCA AGCACTGGAC GTCGAttCTC 240 

TCTCTAGCCC TCAGCACCCC TGCX3GTAGQA GTGCOOCCTC TACXXSUTTTG TGATGGGGTA 300 

CAGAGGCACT TGCTCTTCTG CATGGTGTTC AATAGGCTGO GAGTTTTATT TATCTCTTCa 360 

AACTTTGTAC AAGAGCTCAT G6CTTGTCTT GGGCTTTOGT CATTAAACCA AAGGAAATGG 420 
AAGCCATTCC CCTGTTGCTC TCCTTAG 

Seq ID NO: 613 Protein sequence
Protein Accession #: XP_098151

1 11 21 31 41 51

| | | | | |

MMHLLNSQGW NEPAGPPESW SGVQSSVFLS VYSSLTVPRP SGVGAGSQCW RRNNKSQLEP 60

LFLKSAYCAQ ILFKHWTSIL SLALSTPAVG VPPLPTCDGV QRHLLFCMVF NRLGVLFISS 120
NFVQELMACL GLSSLNQRKW KPFPCCSP

Seq ID NO: 614 DNA sequence

Nucleic Acid Accession #: NM_002658.1

Coding sequence: 77..1372

1 11 21 31 41 51

| | | | | |

GTCCC08CA3 O6CCGT0G06 COCTCCTQCC 6CAG6CCA0C GA6GC0800G CC0TCTAGC6 60 

CCXTCGACCTC GCCACCATGA GAGCCCTGCT OQCGOGCCTO CTTCTCTOCO TCCTGGTOST 120 

GAGOGACTCC AAAGGCAGCA ATGAACTTCA TCAAGTTCCA TCGAACTGTO ACTGTCTAAA 180 

TGGAGGAACA T0TGT6TCCA ACAAGTACTT CTCCAACATT CACT66T6CA ACT6CCCAAA 240 

GAAATTGQOA GGGCAGCACT GTGAAATAGA TAASTCAAAA ACCTGCTA7G AGGGQAATGQ 300 

TCACTTTTAC OGftGGAAAGG CCAGCACT6A CAGCATG6GC 0G6OCCTGOC T6CCCTGGAA 360 

CTCTGCCACT GTCCTTCAGC AAAOGTACXA T60CCACAGA TCTGATGCTC TTCAGCTGGG 420 

CCTGGGGAAA CATAATTACT GCAGGAACCC AGACAACOGQ AGOCGACCCT GGTGCTATGT 480 

GCA6GT6GGC CTAAAOCOQC TOXSTCCAAGA 6T6CATGGT6 CATGACTG06 CAGAT6GAAA 540 

AAAGCCCTCC TCTCCTCCAO AAGAATTAAA ATTTCAGTGT 6GCCAAAAGA CTCTG AGGCC 600 

CCGCTTTAAO ATTATT6G66 6AGAATTCAC CACCATOGAG AACCA6G0CT GGTTTGGGGC 660 

CATCTACAGG AGGCACOGGG G66GCTCTGT CACCTACGTG TGTGGAGGCA GCCTCATCAG 720 

CCCTTGCTGG GTGATCAGCG CCACACACTG CTTCATTGAT TACCCAAAGA AGGAGQACTA 780 

CATCGTCTAC CTGGGTOGCT CAAGGCTTAA CTCCAACAOQ CAA6GGQAGA IGAA6TTTGA 840 

0ST6GAAAAC CTCATCCTAC ACAAG8ACTA CAQ08CTQAC AGGCTTGCTC ACGACAACGA 900 

CATTGCCTTO CTGAAGATCC GTTCCAAGGA GGOCAGGTGT GOGCRGCCAT CCCG6ACTAT 960 

ACAQACCATC TGCCTGCCCT CGATGTATAA CX3ATCCCCAG TTTGGCACAA GCTGTQAGAT 1020 

CACTGGCTTT GGAAAAGASA ATTCTACCGA CTATCTCTAT CCGGAGCAGC TGAAAATGAC 1080 

TGTTGT6AA0 CT6ATTTCCC ACCGGGAGTG TCAGCA60CC CACTAC7A03 GCTCTGAAGT 1140 

CACCACCAAA A1GCTATGTQ CTGCT6ACCC CCAAIOGAAA ACA6ATTCCT GCCAGG6AGA 1200 

CTCAGGGGGA CCCCTCGTCT QTTCCCTCCA AGGCCGCATG ACT7TGACT6 GAATTGTGAG 1260 

CTQGGGCCGT GGATGTGCCC TGAAGGACAA GCCAGGCGTC TACACGAGAG TCTCACACTT 1320 

CTTACCCTGQ ATCG6CAGTC ACACCAAOGA AGAGAATGGC CT6GCCCTCT GAGGGTCCCC 1380 

AGGGAGGAAA OGGGCACCAC COGCTTTCTT GCT6GTTGTC ATTTTT6CA0 TAGAGTCATC 1440 

TCCATCA6CT GTAAGAAOAQ ACTGGGAAGA 7AGQCTCTGC ACAGATGGAT TTGCCTGTG6 1500 

CACCACCA60 GTGAA06ACA ATAGCTTTAC CCTCA06GAT AGGCCTGGGT 6CIG6CT6CC 1560 

CAGACCCTCT GGCCAG6ATG GA6GGGTGGT CCTGACTCAA CATGTTACTG ACCAGCAACT 1620 

TOTCTTTTTC TGGACT6AAG CXnX3CAGGAG TTAAAAAGGG CAGGGCATCT GCTOTQCATG 1680 

G6CTCGAA6G GAGAGCCA6C TCCCCOGACC 66TGQGCATT TGTGAGGCCC ATGGTT6AGA 1740 

AATGAATAAT TTCCCAATTA GGAAGTGTAA GCA6CIQAQ6 TCTCTTGAGG GAGCTTA60C 1800 

AATGTGGGAG CAGCGGTTTG GGGAGCAGAG ACACTAA06A CTTCAGGGCA GGGCTCTGAT 1860 

ATTCCATGAA TGTATCAGGA AATATATATG TGTGTGTATG TTTGCACACT T6TTGTGTG0 1920 

GCTGTGAGTG TAAGTGTGAG TAAGAGCTGG TGTCTGATTG TTAA6TCTAA ATATTTCCTT 1980 

AAACTGTGTG GACTGTGATG CCACACAQAS TGGTCTTTCT GGAGAOGTTA TAGGTCACTC 2040 

CT6GGGCCTC TTGGGTCCCC CAOGTGACAO TOCCTGGQAA TGTACnATT CTQCA0CAT6 2100 

ACCT G T G ACC AGCACTGTCT CAGTTTCACT TTGACATAOA T6TC0CTTTC TTGOOOGTr 2160 

ATCCCrrCCT TTTAGCCTAG TTC ATCCAAT CCTCACT600 TGQQGTGAOG AOCACTOCTT 2220 

ACACTQAATA TTTATATTTC ACTATTTTTA TTTATAITTT TGTAATTTZA AATAAAAGTG 2280 
ATCAAXAAAA TGTGATTTTT CTGA 
Seq ID NO: 615 Protein sequence
Protein Accession #: NP_002649.1

1 11 21 31 41 51

| | | | | |

MRALLARUiL CVLWSDSKG SNEUIQVPSN CDCUIGGTCV SNKYPSITDIM CNCPKKFGGQ 60 

HCEIDKSKTC YBGNGHPYRG KASTDTMGRP CLPHNSATVL OQTYHAHRSD ALQLGLGRHN 120 

YCRNPDNRRR PWCYVQVGLK PLVQECKVHD CADGKKPSSP PEELKPQOGQ KTLRPRFKII 180 

GGEFTTZENQ PWFAAZyRRB tKSGSVTYVOG GSLZSPOfVI SA7BCFZDYP XKEDYIVYLG 240 

RSRUrSirFQG EMKFBVENLX LHKDYSADTL AHHNDIALLK ZRSKBQRCAQ PSRTIQTICL 300 

PSHYiaDFQFG TSCBZT6FGK Et^STDYLYPE QLKMIWKLZ SBRBOQQPBY YGSBVTTKML 360 

CAADPQWKTD SOQGDSGGPb VCSXiQCKMTL TGXVSHGKGC ALKDKPGVYT RVSHPLPWIR 420 
SHTKEENGLA L



Seq ID NO: 616 DNA sequence

Nucleic Acid Accession #: NM_024422.1

Coding sequence: 202..2907

1 11 21 31 41 51

| | | | | |

GGCCAAA66A AAAGCCCCTT GGATGAGAG6 CAGGOGCTTC AGAGAAGCIA AGAAAAGCAC 60 

CTCrCOGOGC GCOCCACCrC CTCCGCCTCG OSCTCCTCCT GAGCAGCGGG CCCAGACTGC 120 

GCTCCGGCCG CS3GCCCTCGC CCOGOGGAGC CCTCCTACCC CGGCCCGACG CTCGGCCOGC 180 

GACCTGOCCC GAGCCCTCTC CATGGAGGCA GCCOGCCCCT CCGGCTCCTG GA ACGG AGCC 240 

CTCTGCOGGC TGCTCXTTGCT GACCCTCGCG ATCTTAATAT TTGCCA6TGA TQCCTGCAAA 300 

AATGTGACAT TACATSTTCC CTCCAAACTA GATGCCGAGA AACTTGrTGG TAGAGTTAAC 360

CTGAAAGAGT GCTTTACAGC TGCAAATCTA ATTCATTCAA GTGATCCTGA CTTCCAAATT 420 

TTGGAGGATG OTTCAGTCTA TACAACAAAT ACTATTCTAT TGTCCTCX3GA GAAGAGAAGT 480 

TTTACCATAT TACTTTCCAA CACTGAGAAC CAAGAAAAGA A6AAAATATT TGTCTTTTTG 540 

GAGCATCAAA CAAAGGTCCT AAAGAAAAGA CATACTAAAG AAAAAGTTCT AAGGCX5CGCC 600 

AAGAGAAGAT GOGCTCCAAT TCCTTOTTCX5 ATGCTAGAAA ACTCCTTGGG TCCTTTTCCA 660 

CTTTTOCTTC AACAOGTTCA ATCTGACAOS OCCCAAAACT ATACCATATA CTATTCCATA 720 

AGAOGTCCTQ 6AGTTGAGCA AGAACCTOGG AA7TTATTTT ATGTG6AGACS AGACACTGGA 780 

AACTTGTATT GTACTOGTOC TGTAGATOGT GAGCAGTATG AATCTTTTGA GATAATTGCC 840 

TTTGCAACAA CTCCAGATGG GTATACTCCA GAACTTCCAC TGCCCC TAAT AATCAAAATA 900 

GAGGATGAAA ATQATAACTA CCCAATTTTT ACAGAAGAAA CTTATACTTT TACA^TTTTT 960 

GAAAATTGCA GAGTO6GCAC TAC T SIGG G A CAAGTSTOT G CTACZXSACAA AGATGA6CCT 1020 

GACA06ATGC ACACACGOCT GAA6TACTGC ATCATTGOGC AGGTGCCACC ATCACCCACC 1080 

CTATTTTCTA TGCATCCAAC TACAGGGGTG ATCACCACAA CATCATCTCA GCTAGACAGA 1140 

GAGTTAATTG ACAAGTACCA GTTGAAAATA AAAGTACAAG ACATGGATGG TCAGTATTTT 1200 

GGTCTACAGA CAACTTCAAC TTGTATCATT AACATTGATG ATGTAAATGA CCAC TTGCCA 1260 

ACATTTACTC GTACTTCTTA TGTGACATCA 6TGGAAGAAA ATACAGTTGA TGTGGAAATC 1320 

TTA06AGTTA CTGTTGAG6A TAAGGACTTA GTGAATACTG CTAACTG6A6 AGCTAATTAT 1380 

ACCATTTTAA AQGGCAATGA AAATGGCAAT TTTAAAATTG TAACAGATGC CAAAAC CAAT 1440 

GAAGGAGTTC TTTGTGTAGT TAAG CVm G AATTATGAAG AAAAGCAACA GATGATCTTG 1500 

CAAATTGGTG TAGTTAATGA AGCTCCATTT TCCA6AGAG0 CTAGTCCAAG ATCAGC CATQ 1560 

AGCACABCAA CAGTTACTGT TAATGTAGAA GATCAGGATG AGGGOOCTGA GTGTAAOCCT 1620 

CCAATACAGA CTGTTCGGAT GAAAGAAAAT 6CAGAAGTGG GAACAACAAG CAATGQATAT 1680 

AAAGCATAT6 ACCCAGAAAC AAGAAGTAGC AGTGGCATAA GGTAT AAGAA ATTAACT6AT 1740 

CCAACAGGGT GGGTCACCAT TGA7X5AAAAT ACAOGATCAA TCAAAGTTTT CAGAAGCCTG 1800 

GATAGAGAGG CAGAGACCAT CAAAAATGGC ATATATAATA TTACAGTCCT TGCATCAGAC 1860 

CAAGGAG6GA GAACATOTAC GGGGAGACTG G6CATTATAC TTCAAGAOQT QAATQ ATAAC 1920 

AGCCCATTCA TACCTAAAAA GACAGTGATC ATCTGCAAAC CCAOCATQTC ATCTOCGGAO 1980 

ATTGTTGCOG TT6ATCCTGA TGAGCCTATC CATGGCCCAC CCTTTGACTT TAGTCTGGAG 2040 

AGTTCTACTT CAGAAGTACA GAGAATGTGG AGACTGAAAG CAATTAATGA TACAGCAGCA 2100 

CGTCTTTCCT ATCAGAATGA TCCTCCATTT GGCTCATATQ TAGTACCTAT AAO^STGAGA 2160 

GATAGACTTG OCAIGTCTAG TOTCACTTCA TTGGATGTTA CACTGTGTGA CTGC ATTA OC 2220 

GAAAATGACT GCACACATCG TOTAGATCCA AGGATTGOCG GTGGAGGAGT ACAACTTQQA 2280 

AAGTGGGCCA TCCTTGCAAT ATTGTTGGGC ATAGCATTGC TCTTTTGCAT CCTGTTTAC6 2340 

CTQGTCTGTG GGGCTTCTGG GACGTCTAAA CAACCAAAAG TAATTCCTGA TGATTTAGCC 2400 

CAGCAGAACC TAATTGTATC AAACACAGAA GCTCCTGGAG ATGACAAAGT GTATTCTGOG 2460 

AATGGCrrCA CAACOCAAAC TGTGGGOOCT TCTGCTCAG6 6AGTTTGTGG CACGGrGGGA 2520 

TCAGGAATCA AAAAOGGAGG TCAGGAGACC AT0GAAAT66 TGAAAGGAGG ACAOCAGACC 2580 

TOGGAATCCT GCCGGGCGGC TGGCCACCAT CACACCCTGG ACTCCTGCAO GGGAGGACAC 2640 

AOSGAGGTGO ACAACTGCAG ATACACTTAC TOCGAGTGGC ACAOTTTTAC TC^GCCCCGT 2700 

CTTSGTGAAA AAGTGTATCT GTOTAATCAA 6ATGAAAATC ACAAGCATGC CCAAGAC TAT 2760 

GTCCTGACAT ATAACTATGA AGGAAGAGGA TOGQTGGCTO G6TCTGTA66 TT0TTQCM3T 2820 

GAAG6ACAAG AAGAAOATGG GCTT6AATTT TTGGATAATT TG6AGCCCAA ATTTAG6ACA 2880 

CTAGCAGAAO CATOCATGAA GAGATGAGTG TGTTCTAATA AGTCTCTGAA AGCCAGTGGC 2940 

TTTATX3ACTT TTAAAAAAAA TTACAAACCA AGAATTTTTT AAAGCAGAAG ATGCTATTTG 3000 

TGQGGGTTTT TCTCTCATTA TTTG6ATGGA ATCTCTTTGG T CftAATG CAC ATTTACAGAG 3060 

AGACACTATA AACAAGTACA CAAATTTTTC AATTTTTACA TATTTTTAAA TTACTTAT CT 3120 

TCTATCCAAG GAGGTCTACA GAQAAATTAA AGTCTGCCTT ATTTGTTACA TTTQQGTATA 3180 

ATGACAACAG CCAATTTATA OTOCAATAAA ATGTAATTAA TTCAAGTCCT TATTATAQAC 3240 

TATTTGAAOC ACAACCTAAT GGAAAATTGT AGAGACCTTG CTTTAACATT ATCTCCAGTT 3300 

AATTAAGTGT TCATGTGGT6 CTTGGAAACT GTTGTTTTCC TGAAC ATCTA AAGTGTGTAG 3360 

ACTQCATTCT TGCTATTATT TTATTCTTGT AATGTGACCT TTTCACTGTG CAAAGGGAGA 3420 
TTTCTAGCCA GGCATTGACT ATTACAATTT CATT 



Seq ID NO: 617 Protein sequence
Protein Accession #: NP_077740.1 

1 11 21 31 41 51 

MEAARPSGSM NGALCRLLLL TLAILIPASD ACKNVTLHVP SKLDASKLVO RVNLKECPTA 60 
AMLIHSSDPD FQILEDGSVY TTNTILLSSE KRSFTILLSM TENQEKKKIP VFLEHQTKVL 120 
KKSHTKEEVL RRAXRRHAPX PCSKLBHSLQ PFFIiFLQQVQ SDTAQHYTIY YSIRGFGVDQ 180

EPRNLFYVER OTGNLYCTRP VDRBQYESFB IIAPATTPDG rrPELFLPLI IKIEDBOWy 240 

PIPTEETYTF TIFHICRVGT TVGQVCATDK DEKJtMHTRI, KYSIIGQVPP SPTIiFSMHPT 300 

TGVITTTSSQ UJRELIDKYQ liKIKVQDMDO QYFGLQTTST CIINIDDVND HLPTFTRTSY 360 

VT6VE£2ITVD VBIZ^RVTVED XDLVNTAKKR ANYTILRBNE NGHFKIVT13A fCQJESVLCW 420 

KPUIYEBRQQ HlUiVSWSB APFSREASPR SANSTATVTV 2IVQXZDB6PB CNPPIQTVRM 480 

KQ3ABVGTTS KQYKAYDPET RSSSGIRYKK LTDPTGNVTI DENTGSIKVP RSLOREABTI 540 

KKGIYNITVL ASDQGGRTCT GTLGIILQDV ND»SPPIPKK TVUCKPTMS SABIVAVDPD 600 

BPIHGPPFDP SliESSTSEVO RMHRLKADJD TAARLSYQND PPFGSYWPI TVRDRLOeS 660 

VTSLOVTLCD CITEimCTBR VDPRI6GG6V QUSKKULAI LLGXALLFCI LFTlfVOSIlSG 720 

78KQPKVZPD DLAQQULIVS KTBftPGDDKV YSANGFTTt}! VGASAQGV08 TVGSGIKNGQ 780 

QETIEKVRG6 BQrrSBSCRGA GHHBTUJSOt GGRTEVDMCS yTYSEHBSFT QPRLGEKVYL 840 

CHQDENBXBA QDYVLTYNYB GRGSVASSVG 0CSESQEED6 LBFLDMLEPK FRTIAEAOOC 900 
R 

Seq ID NO: 618 DNA sequence

Nucleic Acid Accession #: NM_004949.1

Coding sequence: 202..2745

1 11 21 31 41 51

| | | | | |

CGCCAAAG6A AAA6CCCCTT G6ATGA6AG6 CA0G06CTTC AGA6AAGCTA AGAAAASCAC 60 

CTCTCCGCGC GCXXCACCTC CTCOSCCTCG 06CTCCTCCT GAGCAGCGGG CCCAGACTGC 120 

GCTCCGGCCG OGGCCCTCXK: CCCGCGGAGC CCTCCTACCC CGGCCOGACG CTCX3GCCCGC 180 

GACCTGCCCC GAGCCCTCTC CATGGAGGCA GCCOGCCCCT CC3GGCTCCTG GAAOGGAGCC 240 

CTCTGCOGGC TGCTCCTGCT GACCCTCX5CG ATCTTAATAT TTGOCAGTGA TGCCTOCAAA 300 

AATGTGACAT TACATGTTCC CTCCAAACTA GATGCCQAGA AACTTGTTGO TAQAGTTAAC 360

CTGAAAGAGT GCTTTACAGC TGCAAATCTA ATTCATTCAA GTGATCCTGA CTTCCAAATT 420 

TTGGAGGATG GTTCAGTCTA TACAACAAAT ACTATTCTAT TGTCCTCGGA GAAGAGAAGT 480 

TTTACCATAT TACTTTCCAA CACTGAQAAC CAAGAAAAGA AGAAAATATT TGTCTTTTTG 540 

GAGCATCAAA CAAAGGTCCT AAAGAAAAGA CATACTAAAG AAAAAGTTCT AAGGCGCGCC 600 

AAGAGAAGAT GGGCTCCAAT TCCTTGTTCG ATGCTAGAAA ACTCCTTGGQ 7CCTTTTCCA 660 

CTTTTCCTTC AACAGGTTCA ATCTGACAOS GCCCAAAACT ATACCATATA CTATTCCATA 720 

A6AG6TCCTG 6AGTTQACCA AGAACCTC60 AATTTATTTT ATGTGGAQA6 AGACACTGGA 780 

AACTTGTATT GTACTCQTCC TOTAGATCGT GAGCAGTATG AATCTTTTGA GATAATTGCC 840 

TTTGCAACAA CTCCAGATGG GTATACTCCA GAACTTCCAC TGCCCCTAAT AATCAAAATA 900 

GIVGGATGAAA ATGATAACTA CCCAATTTTT ACAGAAQAAA CTTATACTTT TACAATTTTT 960 

6AAAATT6CA GAGTGOOCAC TACTSTGGCSA CAAGTGTGT6 CTACTGACAA AGATGAGCCT 1020 

GACAOGATGC ACACACOOCT GAAGTACTOC ATCATTOGGC AOGTGCCACC ATCACCCACC 1080 

CTATTTTCTA TGCATCCAAC TAC3W3GC3GTG ATCACCACAA CATCATCTCA GCTAGACAGA 1140 

GAOTTAATTG ACAAGTACCA GTTGAAAATA AAAGTACAAG ACATGGATGG TCAGTATTTT 1200 

GGTCTACAGA CAACTTCAAC TTGTATCATT AACATTGATG ATGTAAATGA C CACT TGCXrA 1260 

ACATTTACTC GTACTTCTTA T6TGACATCA GIGQAAQAAA ATACAGTTGA TGTGGAAATC 1320 

TTAOGAGTTA CTGTT6AGGA TAAGGACTTA GTQAATACTS CTAACTQGAS A6CTAATTAT 1380

ACCATTTTAA AGGGCAATGA AAATGGCAAT TTTAAAATTG TAACAGATGC CAAAACCAAT 1440 

GAAGQAGTTC TTTGTOTAGT TAAGOCTTTO AATTATGAAG AAAAGCAACA GATGATCTTG 1500 

CAAATTGGTO TAGTTAATGA AGCTCXATTT TCCAGAGAGG CTAGTCCAAG ATCAGCCATG 1560 

AGCACAGCAA CAGTTAC7GT TAATGTAGAA GATCA6GAT6 AGG6CCCT6A GTGTAACOCT 1620 

OCAATACAGA CTGTTOGCAT GAAAGAAAAT GCA6AA6TG0 GAACAACAA6 CAATGGATAT 1680 

AAAGCA7ATG ACCCAGAAAC AAGAAGTAGC AGTGGCATAA GGTATAAGAA ATTAACTGAT 1740 

CCAACAGGGT GOGTCACCAT TGATGAAAAT ACAGGATCAA TCAAAGTTTT CAGAAGCCTG 1800 

GATAGAGAGG CAGAGACCAT CAAAAATG6C ATATATAATA TTACAGTCCT TGCATCA6AC 1860 

CAAG6AGGGA OAAGATOTAC G06GACACTG QGCATTATAC TTCAAGAOST 6AAT8ATAAC 1920 

AGCCCATTCA TACCTAAAAA QACAGT6ATC ATCTGCAAAC OCACCATGTC ATCTGOQGAO 1980 

ATTGTTGCGG TTGATCCTGA TGAGCCTATC CATGGCCCAC CXTTTTGACTT TAGTCTQGAG 2040 

AGTTCTACTT CAGAAGTACA GAGAATGTGG AGACTGAAAG CAATTAATGA TACA6CAGCA 2100 

OGTCTTTCCT ATCAQAATGA TCCTCCATTT QGCTC ATATG TA GTACCTA T AACAGTGAGA 2160 

6ATAGACTTG QCATGTCTAO TGTOICTTCA TT6GATQTTA CACTGTGTGA CTGCATTACC 2220 

6AAAATQACT 0CACACATO8 TGTASATCCA AGGATTGGGG GTGGAG6AGT ACAACTT6GA 2280 

AAGTGGGCCA TCCTTGCAAT ATTGTTGGGC ATAGCATTGC TCTTTTGCAT CCTOTTTACG 2340 

CTGGTCTGTG GGGCTTCTGG GACGTCTAAA CAACCAAAAG TAATTCCTGA TGATTTAGCC 2400 

CAGCAGAACC TAATTGTATC AAACACAQAA QCTCXTrGGAG AT GACAAAG T GTA TTCTG CG 2460 

AAJGGCTTCA CAACCCAAAC TGTGGOCQCT TCT6CTCAG6 QAGTTTGTGG CAC08TQGGA 2520 

TCAGGAATCA AAAACGGAG6 TCAG6AGACC AT0QAAATG6 TGAAAGGAGG ACACCAGACC 2580 

TOGGAATCCT GC0GG66GGC TG6CCACCAT CACACCCTGG ACTCCTGCAG GGGAQGACAC 2640 

AOSGAGGTGG ACAACTGCAG ATACACTTAC TCGGAGTG6C ACAGTTTTAC TCAGCCOCGT 2700 

CfTGGTG A AG AATCCATTAG AGGACACACT CTGATTAAAA ATTAAACAAT GAAAGAAAGT 2760 

GTATCTGIOT AATCAAGATG AAAATCACAA GCATGCCCAA QACTATGTCC TGACATATAA 2820 

CTAT6AAGGA AGAGGATOGG TGGCTG6GTC TGTAGGTTGT T6CAGT6AAC GACAA8AAGA 2880 

AGATGGGCTT GAATTTTTGG ATAATTTGGA GCCCAAATTT AGGACACTAO CA6AAGCATG 2940 

CATGAAGAGA TGAGTGTGTT CTAATAAGTC TCTGAAAGCC AGTGGCTTTA TGACTTTTAA 3000 

AAAAAATTAC AAACCAAGAA TTTTTTAAAG CAGAAGAT6C TATTTGTGG6 6GTTTTTCTC 3060 

TCATTATTTQ GATGOAATCT CTTTGGTCAA ATGCACATTT ACAGAGAGAC ACTATAAACA 3120 

AGTAQVCAAA TTTTTGAATT TTTACATATT TTTAAATTAC TTATCTTCTA TCCAAGGAGG 3180 

TCTACA6AGA AATTAAAGTC TGCCTTATTT GTTACATTTG GGTATAATGA CAACAGCCAA 3240 

TTTATAGT6C AATAAAATGT AATTAATTCA AGTCCTTATT ATAGACTATT TGAAGCACAA 3300 

CCTAATGGAA AATTGTAGAG ACCTTQCTTT AACATTATCT CCAGT TAATT AAGTXnTCAT 3360 

GTGGTQCTTO GAAACTGTTG TTTTOCTGAA CATCIAAAOT GT6TAGACT6 CATTCTTOCT 3420 

ATTATTTTAT TCTTGTAATG TGAOCTTTTC ACTGTQCAAA GG6AGATITC TAG0CAQ6CA 3480 
TTQACTATTA CAATTTCATT 

Seq ID NO: 619 Protein sequence 
Protein Accession #: NP_004940.1 

1 11 21 31 41 51 

I I I I I I 

MEAARP5GSN mkUCRLLLL TLAZLZFASD ACKZnmiHVP SKLDABKLVG RVNLKBCFTA 60 
AmiZHSSDPD FQILEDGSVY TTNTILLSSB KRSFTIUiSN TQIQERXXIF VFIiEBQTKVL 120 

KKRBTKEKVL RSAXRRNAPI PCSMLENSLO' PFPLFZiQOVQ SDTAONyTZY YSZHGPGVDQ 180 

EPRNLFYVER DTGNLYCTRP VDRBQyESFB IIAPATTPDG YTPBLPLPLI IKIEDENDNY 240 

PIPTBBTYTP TIPENCRVGT TVGQVCATDK DBPDTMHTRL KYSIIGQVPP SPTLFSKHPT 300 

TGVITTTSSQ LDREUIDKYQ UtlKVQDKDG QYFGLQTTST CIINIDOVND HLPTPTRTSY 360 

VTSVEBnVD VEZLRVTVED XDLVNTANKR ANyTILKONB IRSNFRXVTSA XmSGVXiCVV 420 

KPUTYEEKQQ MZLQXGWNB APPSRBASPR SAMSTA7VTV NVEDQDB6PE CKPPZQTVRM 480 

KESAEVGTTS NC5YKAYDPET RSSSGIRYKK LTDPTGWVTI DENTGSIKVP RSLDREAETI 540 

KNGIYNITVI, ASDQGGRTCT GTLGIILQDV NDNSPFIPKK TVIIC3CPTT4S SAEIVAVDPD 600 

EPIEGPPFDF SLBSSTSEVQ RMWRLKAZND TAARLSYQND FPFCSYVVFI TVKDRLG^SS 660 

VTSIiDVTIiCD CZTEKDCTKR VDPRZGOQGV QLGKHAZIAZ tiLGZALLFCZ LFTXiVCGASG 720 

TSKQPKVZPD DLAQQNLZVS NTBAPGDOKV YSANGFTTQT VGASAQ6V06 TVGSGZXKGG 780 

QBTZQfVKCG HQTSESCR6A GHHHTLDSCR GGHTBVDNCR YTYSEWRSFT QPRLGEBSZR 840 
(HITLZKK 






Seq ID NO: 620 DNA sequence 

Nucleic Acid Accession #: NM_032545.1 

Coding sequence: 46..718 

1 11 21 31 41 51 

I I I I I I 

AAACTGATCT TCAAT6CACT AAGAQAAGGA GACTCTCAAA CCAAAAATGA CCTGGAGGCA 60 

CCATGTCAGG CTTCTGTTTA CGGTCAGTTT GGCATTACAG ATCATCAATT TGGGAAACAG 120 

CTATCaAAGA GAGAAACATA ACG60GGTAG AGAGGAAGTC ACCAAGGTTG CCACTCAGAA 180 

GOVC06ACAG 7CACOGCTCA ACTGGACX7C CAGTCATTTC GGAGAGGTGA CTGGGAG06C 240 

GGAGG6CTGG 6G6C06GA0G A6C0GCTCCC CTACTCOOQO GCTTTOGGAO A3QGT6COTC 300 

CG0GGG6CCG CGCTGCTGCA 6GAA00GCGG TACCTQ08TG CTGGGCSUSCT TCT606TGTG 360 

CCOGGCCCAC TTCACOGGCC GCTACTGCX5A GCATGACCAG AGGCGCA(5TG AATGOGGCGC 420 

CCTGGAGCAC GGAGCCTOGA CCCTCCGCGC CTGOCACCTC TGCAGGTCCA TCTTCGGGGC 480 

CCTGCACTGC CTCCCCCTCC AGAOGCCTQA CCGCTGTGAC CCGAAAGACT TCCTGGCCTC 540 

CCAOGCTCAC GGGCCGAGCG CCGGGGGCGC GCCCAGCCTG CTACTCTTGC TGCCCTGCGC 600 

ACTCCTGCAC CGCCTOCTGC GCCC3GGATGC GCCCGC3GCAC CCTCGGTCCC TGGTCXXrrTC 660 

CGT0CTOCA6 OGGGAGOOGC GCCOCTGOOG AAGGOO^OGA CET6G6CATC 6CCTTXAATT 720 

TTCTATGTTG TAAATAATAG ATGTGTTTAG TTTACOGTAA GCT6AA6CAC TGQGTGAATA 780 

TTrrTATTOG GTAATAAATA TTTTCATGAA A60Q0CAAAA AAAAAAAAAA AAAAAAAAAA 840 






Seq ID NO: 621 Protein sequence 
Protein Accession #: NP_115934.1 

1 11 21 31 41 51 

I I I I I I 

NTHRKRVRLIi PTVSIALQZZ NLGNSYQREK BNGQREEVTR VATQKKRQSP LNWTSSHFGB 60 
VTGSAEGWSF BBPLPYSRAF GE6ASARPRC CRHQOTCVLO SPCVCPABFT GRYCEBDQRR 120 
8EC6ALEKQA HTLRACKLCR ClFGAiaCLP LQTPDRCDPK DFLASBAHGP SAGGAPSltLIi 180 
LLPCAIiliHRIi LRPOAPABPR SLVPSVI^RE RRPOCSRPGLO HRL 

Seq ID NO: 622 DNA sequence 
Nucleic Acid Accession #: FGENESH predicted 
Coding sequence: 1..390 

1 11 21 31 41 51 

I I I I I I 

ATGAG6TTCA GTGTCTCAGG CATGA6GACC GACTACOCCA 6GAGTGTGCT GGCTCCTOCT 60 

TATGTGTCAG TCTC5TCTCCT CCV C nXS^lMr CCAAGGGAAG TCAT06CTCC OGCTGGCTCA 120 

6AACCATGGC TGTGCXAOCC GGCACCCAGG TGTtSGAGACA AGATCIACAA CCCCTTGGAG 180 

CAGTG C TgiT ACAATGACGC CATOGTGTCC CTGA60GAGA CCOGCCAATG TGGTCXXXIO C 240 

TGCACCnCT G600CTGCTT TGAGCTCT6C T6TCTTOATT 0CTTT6Q0CT CACAAAOQAT 300 

TTTGTTGTGA AGCTCAAGGT TCAQGGTGTG AATTGCCAGT GCCACTCATC TCCCATCTOC 360 
A6TAAATGTG AAAGAGGC06 GATATGTTAG 



Seq ID NO: 623 Protein sequence 
Protein Accession #: FGENESH predicted 

1 11 21 31 41 51 

I I I I I I 

MRFSVSQ4RT DYPRSVLAPA YVSVOjLIiXiC PREVIAPAGS EPWLCQPAPR CGDKZYNPLB 60 
QCCVNDAZVS LSBTRQOSPP CTFWPCFBLC CLDSFGLTMD FWXLKVQGV NSQCBSSPX8 120 
SKCERGRZC 
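
The "Coding sequence" ranges in these listings (for example 1..390 for Seq ID NO: 622) appear to be 1-based and inclusive, and to include the stop codon. Under those assumptions, a range of 1..390 covers 390 nucleotides, or 130 codons, which agrees with the 129 residues listed for Seq ID NO: 623. The following is a minimal sketch of that arithmetic; the function names are illustrative and are not part of the application.

```python
def cds_length(start: int, end: int) -> int:
    """Nucleotides covered by a 1-based, inclusive range such as 1..390."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_cds(listing_text: str, start: int, end: int) -> str:
    """Slice the coding region out of a listing.

    Whitespace is ignored and purely numeric tokens (the position
    counters at the end of each row) are dropped before slicing.
    """
    blocks = [tok for tok in listing_text.split() if not tok.isdigit()]
    cleaned = "".join(blocks).upper()
    if end > len(cleaned):
        raise ValueError("range extends past the end of the sequence")
    # Convert the 1-based inclusive range to a Python slice.
    return cleaned[start - 1:end]


def expected_residues(start: int, end: int, includes_stop: bool = True) -> int:
    """Residue count implied by a coding range (assumes whole codons)."""
    n = cds_length(start, end)
    if n % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    codons = n // 3
    return codons - 1 if includes_stop else codons
```

A check of this kind is only as reliable as the transcription of the listing: a range that does not divide into whole codons, or a residue count that disagrees with the listed protein, suggests a transcription problem in either the range or the sequence.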

Seq ID NO: 624 DNA sequence 
Nucleic Acid Accession #: M18728.1 
Coding sequence: 51..1085 

1 11 21 31 41 51 

I I I I I I 

GGAGCTCAAG CTCCTCTACA AA0AGGTX3GA CAC3A6AAGAC AGCASAGACC ATGGGACCCC 60 

CCTCAOCCCC TCCCTOCAGA TTGCATGTCC CCIG GA AGGA GGTOC T GCTC ACAGCCTCAC 120 

TTCTAAOCTT CTGGAACCCA OOCACCACTG CCAAGCTCAC TATTGAATOC AOGCCATTCA 180 

AT6TOGC3«SA GGGGAAGGA6 GTTCTTCTAC TCGCCCACA^ CCTGCCCCAG AATCGTATTG 240 

6TTACAQCT6 GTACAAAGGC 6AAAGA6TGG ATGGCAACAQ TCTAATTGTA 6GATAT6TAA 300 

TAGGAACrCA ACAAGCTACC CCAGG6CC00 CATACAGTGG TCGAGAGACA ATATACCCCA 360 

AT6CATCCCT 0CT6ATCCM3 AACQTCACCC AGIkATGACU; AOSATTCTAT ACCCTACAA6 420 

TCATAAA6TC ASATCTTGTG AAT6AA6AAG CAACCGGACA GTTCCATGTA TAC008GAGC 480 

TGCCCAAGCC CTCCATCTCC AOCAACAACT OCAACCCC6T GGAGGACAAO 6ATGCTGTGG 540 

CCTTCACrCTG TGAACCTGAG GTTCAGAACA CAACCTACCT GTGGTCGGTA AATGGTCAGA 600 

GCCTCC0S6T CftGTCCCAGG CTGCAGCTGT CCAATC6CAA CATGACCCTC ACTCTACTCA 660 
GOOTCAAAAG GAAOGATSOV GGAT0CTAT8 AUTGTGAAM ACA6AA0CCA G06AGTGCCA 720 

ACOGCAOTO^V CCCAGTCACC CTGAATGTCC TCTATGGCCC AGATGTOCCC ACCATTTCOC 780 

CCTCAAAGGC CAATTACOGT CCAGGGGAAA ATCTCAACCT CTCCTGCCAC GCAGCCTCTA 840 

ACCCAOCTGC ACAGTACTCT TGGTTTATCA ATGGGAOSTT CX^GCAATCC ACACRAGAGC 900 

TCTTTATCCC CAACATCACT GTQAATAATA 6GGGATCCTA TATX3TGCCAA GCCCATAACT 960 

CftQCCACrGG CCTCAATAGG ACCACAGTCA OQATGATCAC AGTCrCIOGA ACTGC T OCTQ 1020 

TCCTCTCA0C T8TG6CCACC GTOQOCATGA 06ATTSSAGT GCTG O OCAGG GTGK3CTCTGA 1080 
TATAGCAGCC CTGGTGTATT TTCGATATTT CAGGAAGACT GGCAGATTGG ACCAGACCCT 1140 

GAATTCTTCT AGCICCTCCA ATCCCATTTT ATCCCATGGA ACCACTAAAA ACAAGGTCTG 1200 

CTCTGCTOCT GAAGCCCTAT ATGCTGGAGA TGGACAACTC AATGAAAATT TAAAGGC5AAA 1260 

ACXXTCAG6C CTGAfiGTGTG TQOCACICAG AOACTTCACC TA ACTAQAGA CAGTCAAACT 1320 

6CAAACCATQ GT6ASAAATT GA06ACTTCA CACTATGGAC AGCTTTTGCC AAGATGTCAA 1380 

AACAAGACTC CTCATCATGA TAAGGCTCTT ACCCCCTTTr AATTTGTCCT TGCTTATGCC 1440 

TGCCTCTTTC GCTTGGCAGG ATGATGCTGT CATTAGTATT TCACAAGAAG TAGCTTCAGA 1500 

GGGTAACTTA ACAGAGTGTC AGATCTATCT TGTCAATCCC AACX3TTTTAC ATAAAATAAO 1560 

AGATOCTTTA GT6CACCCAG T6ACTGACAT TAGCAGCATC TTTAACAGAO COSTGTGTTC 1620 

AAATGTACAG TGGTCCTTTT CAGAfiTTOGA CTTCTAGACT CACCTGTTCT CACTCCCTGT 1680 

TTTAATTCAA CCCAGCCATG CAATOCCAAA TAATAGAATT GCTOOCTACC AGCTGAACAG 1740 

GGAQGAGTCT GTGCAGTTTC TGACACTTG7 TGTTGAACAT GGCTAAATAC AAT5GGTATC 1800 

GCTGA6ACTA AGTTGTAGAA ATTAACAAAT tfitiCmcriti GTTAAAATGG CTACACTCAT 1860 

CItSACTCATT CTTTATTCTA TTTTAGTTGG TTTGTATCTT GGCTAAOGTG C8TAGT0CAA 1920 

CTCTTGGTAT TACCCTCXTA ATA6TCATAC TAGTAGTCAT ACTCCCTGGT GTAGTGTATT 1980 

CTCTAAAAGC TTTAAATGTC TGCATCCAGC C3U3CCATCAA ATA6TGAATG GTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCA6GAG AACATCATAA CCCATGAAGG 2100 

ATAAAAGCCC CAAATGGTGG TAACTGATAA TA6CACTAAT GCTTTAAGAT TTGGTCACAC 2160 

TCTCACCTAO GTGA6GGCAT TGA6CCA0TO GT6CTAAATG CTACATACTC CAACTGAAAT 2220 

GTTAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAA A GA 2280 

ACACRGGAGA TTOCAGTCTA CTTOAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340 

TTTACAAAAA AGTAACCTQA ACTAATCTGA TGTTAACTAA TGTATTTATT TCTGTGGTTC 2400 

TGTTTCCTTG TTCCAATTTG ACAAAACCCA CTGTTCTTGT ATTGTATTGC CCAGGOGGAG 2460 

CTATCACTGT ACTTGTAGAG TGGTGC TO CT TTAATTGATA AATCACAAAT AAAAGCCAAT 2520 
TAGCTCTATA ACT 

Seq ID NO: 625 Protein sequence 
Protein Accession #: AAA59907.1 

1 11 21 31 41 51 

I I I I I I 

KGPPSAPPCR LEVFWKSVLL TASLLTFWITP PTTAKLTIES TPHTVAEGKB VLLLAHKLPQ 60 

NRIGYSWYKG ERVDCaiSLIV GYVIGTOQAT PGPAYSGRBT lYPNASLLIQ NVTQNDTGPY 120 

TLQVIKSDLV NEEATGQFHV YPELPKPSIS SNNSI^PVEDK DAVAPTCEPB VQNTTyLWWV 180 

NGQSLFVSPR LQLSHGNMTX* TLLSVKRNDA GSYECEIQNP ASANRSDPVT LI7VLYGFDVP 240 

TISPSKANYR PGBHUniSCB AASHPPAQYS WFIllGTFQQS TQZLPIFNXT V8NSGSYM0Q 300 
ABNSATGIHR TTVTHITVSO SAPVLSAVAT VOZTIGVLAR VALZ 

Seq ID NO: 626 DNA sequence 
Nucleic Acid Accession #: M18728.1 
Coding sequence: 1355..1657 

1 11 21 31 41 51 

I i I I I I 

GGAGCTCAAG CTOCTCTACA AAGAGGT6GA CAGAGAAGAC AGCA6AGACC ATGG6ACCCC 60 

CCTCAGCCCC TOGCTGCAGA TT6CAT6TCC 0CTGGAAQ6A GGTCCTQCTC ACAGCCTCAC 120 

TTCTAAGCTT CTGGAACCCA CCCACCACZG CCAAOCTCAC TATTGAATCC ACGCCATTCA 180 

ATGTOGCAGA GGGGAAOGAG GTTCTTCTAC TOGCCCACAA OCTGCCCCAG AATCGTATTG 240 

GTTACAOCTG GTACAAAGGC GAAAGAGTGG ATGGCAACAG TCTAATTGTA GGATATGTAA 300 

TAGGAACTCA ACAAGCTACC CCA0G6CXXG CATACAGTQG TOGAGAGACA ATATAOCCCA 360 

AT6CATCGCT GCT6ATGCAG AAGGTCACCC AGAATGACAC AOGATTCZAT AOCCTACAAO 420 

TCATAAAGTC AGATCTTGTG AATGAAGAAG CAAC06GACA GTTCCATGTA TAC0C3QGAGC 480 

TGCCCAAGCC CTCCATCTCC A6GAACAACT CCAACCCOQT 0GAG6ACAAG GATX5CTG7GG 540 

CCTTCACCTO TQAACCTGAG GTTCAGAACA CAACCTACCT GTGGTGGGTA AATGGTCAGA 600 

GCCT0CXX3GT CAGTCCC3U3G CT6CAGCTGT CXAATGGCAA CATGACCCTC ACTCTACTCA 660 

GC36TCAAAA6 6AA06AT6CA 6GATCCTATG AATXnGAAAT ACAGAACCCA GO3A0TOGCA 720 

ACG6CAGTGA CCCAGTCACC CTGAATGTCC TCTATGGCCC AGATGTCCCC AOCATTTCCC 780 

CCTCAAAGGC CAATTACOGT CCAGGGGAAA ATCTGAACCT CTCCTGCCAC GCAGCCTCTA 840 

AOCCACCTQC ACAGTACTCT TGGTTTATCA ATGQGAOGTT CCAGCAATCC ACACAAGAGC 900 

TCTTTATCCC CAACATCACT GTGAATAATA GOGGATCCTA TATGT6CCAA GCCCATAACT 960 

CAOCCACTGG CCTCAATAGG ACCACAGTCA C6ATGATCAC AGTCTCTGGA AGTGCrCCTG 1020 

TCCTCTCAGC TGTGGCCACC GTOQGCATCA CGATTGGAGT GCT0GCCA6G OTGOCTCTGA 1080 

TATAGCAGCC CT GG TGT A TT TTCGATATTT CAGGAAGACT GGCAQATTGG ACCAGACCCT 1140 

GAATTCTTCT AGCTCCTCCA ATCCCATTTT ATCCCATGGA ACCACTAAAA ACAAGGTCTG 1200 

CTCTGCTCCT GAAGCCCTAT ATGCTGGAGA TGGACAACTC AATGAAAATT TAAAGGGAAA 1260 

A0CCTCAG6C CTQA08T0T0 TGCCACTCAG AGACTTCACC TAACTAGAGA CAGTCAAACT 1320 

GCAAACCAT6 GTGAGAAATT GACGACTTCA CACTATGGAC AGCTTTTCCC AAGATGTCAA 1380 

AACAAGACTC CTCATCATGA TAAGGCTCTT ACCCCCTTTT AATTTGTCCT TGCTTATGCC 1440 

TGCCTCTTTC GCTTGGCAGG ATGATGCTGT CATTAGTATT TCACAAGAAG TAGCTTCAGA 1500 

GGGTAACTTA AGAGAGTGTC AGATCTATCT TGTCAATCCC AAOGTTTTAC ATAAAATAAG 1560 

AOATCCTTTA OIGCAOCCAO TGACT6ACAT TAGCAGCATC TTTAACACA6 O O G TOT Qnx; 1620 

AAATGTACAG T aaTCCrm ' CA6AGTTG6A CTTCTAGACT CACCTGTTCT CACTCCCTGT 1680 

TTTAATTCAA CCCAGCCATG CAATGCCAAA TAATAGAATT GCTCCCTACC AGCTGAACAG 1740 

GGAGGAGTCT GTGCAGTTTC T6ACACTTGT TGTT6AACAT GGCTAAATAC AATGGGTATC 1800 

GCTGAGACTA AGTTGTAGAA ATTAACAAAT GTGCT GCTTG GTTAAAATGG CTACACTCAT 1860 

CTGACTCATT CTTTATTCTA TTTTAGTTGG TTTGTATCTT GCCTAAGGT6 GOTAGTCCAA 1920 

CTCTT GG T A T TAOCCTCCTA ATAOTCATAC TAGTAGTCAT ACTCCCTGGT GTAGTGTATT 1980 

CTCTAAAAGC TTTAAATGTC TGCATGCAGC CAGCCATCAA ATAOTGAATO GTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCAGGAG AACATCATAA CCCATGAAGG 2100 

ATAAAAGCCC CAAATGGTGG TAACTGATAA TAGCACTAAT GCTTTAAGAT TTGGTCACAC 2160 
TCTCAOCTAG GTGAGOGGTlT TGAGCCAGTG GTGCTAAiVTG CTACATACTC OUCTGAAAT 2220 

GTTAftGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280 

ACACAGGAGA TTCCAGTCTA CTTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340 

TTTACAAAAA AGTAACCTGA ACTAATCTGA TGTTAACCAA TGTATTTATT TCTGTGGTTC 2400 

• iWA ' A t XVm TTOCAATTTG ACAAAACCCA CTGTTCTTCT ATTGTATTGC CCAGGGGGAG 2460 

CTATCACTGT ACTTGXAGAG TGGIGCTGCT TTAATTCATA AATCACAAAT AAAA6CCAAT 2520 
TAGCTCTATA ACT 

Seq ID NO: 627 Protein sequence 
Protein Accession #: AAA59908.1 

1 11 21 31 41 51 

I I I I I I 

MDSFSQDVKT RLLIMIRLLP PFNLSLLMPA 5PAHQDDAVI SISQBVASEG NLTEOQIYLV 60 

NPNVLHKIRD PLVHPVTDIS SIFNTAVCSN VQWSPSELDF 

Seq ID NO: 628 DNA sequence 
Nucleic Acid Accession #: M18728.1 
Coding sequence: 2370..2501 

1 11 21 31 41 51 

I I I I I I 

GGAGCTCAAO CTCCTCTACA AAGAGGTGGA CAGAGAAGAC AGCAGAGACC ATGGGACCCC 60 

CCTCAGCCCC TCCCTGCAGA TTGCATGTCC CCTGGAAGGA GGTCCTGCTC ACAGCCTCAC 120 

TTCTAACCTT CTGGAACCCA CCCACCACTG CCAAGCTCAC TATTGAATCC ACGCCATTCA 180 

ATGTOGCAGA GGGGAAGGA6 GTTCTTCrAC T060CCACAA OCTGCOOCAG AATCGTATTO 240 

GTTACAGCTG GTACAAAGGC GAAAGAGTGG ATGGCAACAG TCTAATT6TA GGATATGTAA 300 

TAGGAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTGQ TCGAGAGACA ATATACCCCA 360 

ATGCATCCCT OCTGATCCAG AAOGTCACCC ACAATGACAC AGGATTCTAT ACCCTACAAG 420 

TCATAAAGTC AQATCTTGT6 AATGAAGAAG CAACCGGACA GTTCCATGTA TACC0G6A6C 480 

TGCCCAAGCC CTCCATCTCC A6CAACAACT CCAAOCOOGT GGAGGACAAO GATGCTGIGS 540 

CCTTCACCTG TGAACCTGAG GTTCAGAACA CAACCTACCT GTGGTG(^A AAT6GTCA6A 600 

6CCTCC0GGT CAGTCCCAGG CTGCA6CTGT CCAATGGCAA CATGACCCTC ACTCTACTCA 660 

GCGTCAAAAG GAAC6ATGCA GGATCCTATG AATGTGAAAT ACAGAACCCA GCGAGTGCCA 720 

ACC6CAGTGA CCCAGTCACC CTGAATGTCC TCTATG6CCC AGATGTCCOC ACCATTTCCC 780 

CCTCAAAOGC CAATTACOGT CCAGGGQAAA ATCTGAACCT CTCCTGCCAC GCAGCCTCTA 840 

ACCCAOCTGC ACAGTACTCT TGGTTTATCA ATGGGAGGTT CX^GCAATCC ACACAAGAGC 900 

TCTTTATCCC CAACATCACT GTGAATAATA GCGGATCCTA TATGTGCCAA GCCCATAACT 960 

CAGCCACTGO CCTCAATAGG ACCACAGTCA CGATGATCAC AGTCTCTGGA AGTX3CtCCTG 1020 

TCCTCTCAGC TGTGGCCACC GTOGGCATCA OGATTGGAGT GCTGGCCAGG 6TGGCTCTGA 1080 

TATAGCAGCC CTGGTGTATT TTCGATATTT CAGGAAGACT GGCAGATTGG ACCAGACCCT 1140 

GAATTCTTCT AGCTCCTCCA ATCCCATTTT ATCCCATGGA ACCACTAAAA ACAAGGTCTG 1200 

CTCTOCTCCT OAAGCCCTAT ATGCTOGAGA TGGACAACTC AATGAAAATT TAAAGGGAAA 1260 

ACCCTCAG6C CTGAGGTGTG TGCCACTCAa AQACTTCACC TAACTAQAGA CAGTCAAACT 1320 

GCAAACCATG GTGAGAAATT QACGACTTCA CACTATQGAC AGCTTTTCCC AAGATGTCAA 1380 

AACAAGACTC CTCATCATGA TAAGGCTCTT ACCCCCTTTT AATTTGTCCT TGCTTATGCC 1440 

TGCCTCTTTC GCTTGGCAGG ATGATGCTGT CATTAGTATT TCACAAGAAO TAGCTTCAGA 1500 

GGGTAACTTA ACAGASTGTC AGATCTATCI T6TCAATCCC AAOSTTTTAC ATAAAATAAG 1560 

A6ATGCTTTA 6T6CACCCAO TQACTQACAT TA0CA6CATC TTTAACACAO CCGTST Q TTC 1620 

AAATGTACAG TGGTCCTTTT CAQAGTTGGA CTTCTAQACT CACCTGTTCT CACTCCCTGT 1680 

TTTAATTCAA CCCAGCCATO CAATGCCAAA TAATAGAATT GCTCCCTACC AGCTGAACAG 1740 

GGAOGAGTCT GTGCAGTTTC TGACACTTGT TGTT6AACAT GGCTAAATAC AATGGGTATC 1800 

GCTGAGACTA AGTTGTAGAA ATTA A CAAAT GIGCTCCTTO QTTAAA ATOa CTA CACrC AT 1860 

CT6ACTCATT CTTTATTCTA TTTTAGTTGG TTTGTATCTT GGCTAAGGTO GGTA6TCCAA 1920 

CTCTTGGTAT TACCCTCCTA ATAGTCATAC TAGTAGTCAT ACTCCCTGGT GTAGTGTATT 1980 

CTCTAAAAGC TTTAAATOTC TGCATOCAGC CAGCCATCAA ATAGTGAATO GTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCAGGAG AACATCATAA CCCATGAAGG 2100 

ATAAAAGOOC CAAATG0T6G TAACTOATAA TAGCACTAAT GCTTTAAGAT TTGGTCACAC 2160 

TCTCACCTAG GT6AGC6CAT T6AGCX»GTQ GTGCTAAAT6 CTACATACTC CAACTGAAAT 2220 

G7TAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280 

ACACAGGAGA TTCCAGTCTA CTTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340 

TTTACAAAAA AGTAACCTGA ACTAATCTGA TGTTAACCAA TGTATTTATT TCTGTGGTTC 2400 

TGTTTCCTT6 TTCCAATTTG ACAAAACCCA C l ' G ' rJ 'C rmT ATTGTATTGC CCAGGGGGAG 2460 

CTATCACraT ACTT6TAGAG TGGTGCT6CT TTAATTCATA AATCACAAAT AAAAOOCAAT 2520 
TAGCTCTATA ACT 

Seq ID NO: 629 Protein sequence 
Protein Accession #: AAA59909.1 

1 11 21 31 41 51 

1 I I i I I 

MLTNVFXSW LFPCSNLTKP TVLVLYCPGG AITVLVEWCC PNS 



Seq ID NO: 630 DNA sequence 

Nucleic Acid Accession #: NM_016639.1 

Coding sequence: 40..429 

1 11 21 31 41 51 

I I I I I I 

GCGGCGGGCG CA6ACAGCG0 CGGGGGCAG6 AOGTGCACTA TGGCTOGGGG CTC6CTGCGC 60 

OGGTTGCTGC GGCTCCTGGT GCTGGG6CTC TGGCTGGOGT TGCTG06CTC CGT GGOC GGG 120 

GAGCAAGCGC CAQQCACGGC CCCCTGCTCC Q60G6CA6CT CCTGGAGOGC GGACCT9GAC 180 

AA6TGCATG6 ACTGOGOGTC TTGCAGG6C6 OGACOGCACA GOGACTTCTG CCTGGGCTGC 240 

GCTGCAGCAC CTCCTGCCCC CTTCCGGCTG CTTTGGCCCA TCCTTOGGGG OGCTCTGAGC 300 

CTGACCTTCG TGCTGGGGCT GCTTTCTGGC TTTTTGGTCT GGAGAOGATG COGCAGGAGA 360 

GAGAAGTTCA CCACCCCCAT AGAGGAOACC G606GAGAGG GCTGCCCAGC TGTGG06CTG 420 

ATCCAGTGAC AATGTGCCCC CTGCCAQOOG OQGCTOGCCC ACTCATCATT CATTCATCCA 480 

TTCTAGA6CC AOICTCTGCC TCCCAOAOGC GG0G66A6CC AAGCTCCTCC AACCACAAOG 540 

QGGGT6GGGG GOQGTX»ATC ACCTCTGAGG CCTGGGCCCA OOGTTCAOGG GAACCTTOCA 600 

AGGTGTCTGG TTGCCCTGCC TCTGGCTCCA GAACAGAAAG GGAGCCTCAC GCTGGCTCAC 660 

ACAAAACA5C TGACACTGAC TAAGGAACTO CAGCATTTGC ACAGGGGAGG G66GT6C0CT 720 

CCTTCCTTAG GA0CTGG6G0 CCA6GCTGAC TTGQGG GG C3V GACTTGACAC TAGGCOCCAC 780 

TCACTCA6AT GTCCTGAAAT TCCACCACGQ GOGTCACCCT GGGQQGTTAG GGAOCTATTT 840 

TTAACACTAO GGOCTGGCCC ACTAGGA6GG CTGGCCCTAA GATACAQACC CCCCCAACIC 900 

CCCAAAGCGG GGftGGAGATA TTTATTTT6G GGAGA G TTT G GAGGGQAG6G A6AATTTATT 960 
AATAAAAGAA 7CTTTAACTT TAAAAAAAAA AAAAAAAA 

Seq ID NO: 631 Protein sequence 
Protein Accession #: NP_057723.1 

1 11 21 31 41 51 

I I I I I I 

MARGSLRRLL RLLVL6LMLA LLRSVAGEQA PGTAPCSRGS SHSADLDKCM DCASCRARPH 60 

8DFCLGCAAA PPAPFRLLWP XLGGALSLTF VLGLtiSGFLV NRRCRRfiBRF TTPIBBT06B 120 

GCPAVALZQ 

Seq ID NO: 632 DNA sequence 
Nucleic Acid Accession #: NM_003816.1 

Coding sequence: 79..2538 

1 11 21 31 41 51 

i I I I I I 

CGGCAGGGTT GGAAAATGAT GGAAGAGGCX3 GAGGT6GA0G CGAC06AGTG CTGAGAG6AA 60 

CCTGC3GGAAT CGGCCGAGAT GGGGTCTGGC GCGOGCTTTC CCTCXaSGGAC CCTTCGTGTC 120 

CGGTGGTTGC TCTTGCTTGO CCTGGTOGGC CCAGTCCTOG GTGCGGCGCG GCCAGGCTTT 180 

CAACAGACCT CACATCTTTC TTCTTATGAA ATTATAACTC CTTGGAGATT AACTAGAGAA 240 

AGAAGAGAAO CCCCTAGGCC CTATTCAAAA CAAGTATCTT ATGTTATTCA GGCTGAAGGA 300 

AAAGAGCATA TTATTC3WrrT GGAAAGQAAC AAAGACCTTT TGCCTGAAGA TTTTGTGGTT 360 

TATACTTACA AGAAOGAAOG GACTTTAATC ACTGACCATC CCAATATACA GAATCATTGr 420 

CATTAT0QG6 6CTATGTGGA GGGAGTTCAT AATTCATOCA TTGCTCTTAG CGAClVriTT 480 

GGACTCAGAG GATtGCTGCA TTTAGAGAAT GOQAGTTATG G6ATTGAACC CCTGCA6AAC 540 

AGCTCTCATT TTGAGCACAT CATTTATOGA ATGGATGATG TCTACAAAGA GCCTCTGAAA 600 

T6TGGAGTTT CCAACAAGGA TATAGAGAAA GAAACDGCAA AGGATGAAGA GSAAGAGCCf 660 

CCCAGCATGA CTCAGCTACT TC6AAGAAGA AGAGCTGTCT TGCCACAGAC COGGTATGTG 720 

GAGCIGTTCA TTGTGGTAGA CAAGGAAAGG 7ATGACA7GA TGGGAAGAAA TCAGACTGCT 780 

GTGAGAGAAG AQATGATTCT CCTGQCAAAC TACTTGGATA GTATGTATAT TATGTTAAAT 840 

ATTOGAATTO TGCTAGTTGG ACTGGAGATT TGGACCAATG GAAACCTGAT CAACATAGTT 900 

G6GGGTGCTQ GTGATGTGCT GGGGAACTTC 6T6CAGTGGC GGGAAAAGTT 7CTTATCACA 960 

OOTOGGAGAC ATGACAGTGC ACAGCTAGTT CTAAAGAAAQ aTTTTQOTGQ AACTGCAGGA 1020 

ATG6CATTTG TGGGAACAGT GTGTTCAAGG ASCCAOQCAO G0GG6ATTAA TGTGTTTGGA 1080 

CAAATCACT6 TGGAGACATT TGCTTCCATT GTTGCTCATQ AATTGGGTCA TAATCTTGGA 1140 

ATGAATCACG ATGATGGGAO AGATTGTTCC TGTGGAGCAA AGAGCTGCAT CATGAATTCA 1200 

GGAGCATCGG 6TTCCAGAAA CTTTAGCAGT TGCAGTGCAQ AGGACTTTGA GAAGTTAACT 1260 

7XAAATAAAG GAGGAAACTG CCTTCTTAAT A7TCCAAA0C CTGATGAAGC CTATA0T6CT 1320 

CCCTCCTGTG GTAATAAOTT G6TGGA0GCT GOGGAAGAOT GT8ACTGT6G TACTCCAAAO 1380 

GAATGTGAAT TGGACCCTTG CTGOGAAGGA A6TACCT0TA AGCTTAAATC ATTTGCTGAG 1440 

TGTGCATATG GTGACTGTTG TAAAGACTGT CGOTTCCITC CAGGAGGTAC TTTATGCCGA 1500 

OGAAAAAOCA GTGAGTYSTGA TGTTCCAGAG TACTGCAATO GTTCTTCTCA GTTC7GTCAG 1560 

CCA6ATGTTT TTATTCAGAA TGGATATCCT TGCCA6AATA ACAAA6CCTA T1GCTACAAC 1620 

GGCATGTGCC AGTATTATGA T6CTCAATGT CAAGTCATCT TTGGCTCAAA AGCX»AG6CT 1680 

6CCCGCAAAG ATTGTTTCAT TGAAOTGAAT TCTAAAGGTG ACAGATTTGG CAATTGTGGT 1740 

TTCTCTG G CA ATQAATACAA GAAOTGTGCC ACTOGGAATO Ci'i'iXnViXl W AAAGCTTCAG 1800 

TGXGA6AATG TACAAGAGAT ACCTGTATTT GGAATTGTGC CTGCTATTAT TCAAACXK:CT 1860 

AGT06AGGCA CGAAATGTTG GGGTGTGGAT TT0CA6CTA6 GATCAGATGT TCCAGATCCT 1920 

666ATGGTTA AOGAAGGCAC AAAAT6TGGT GCTGGAAAGA TCIGTAGAAA CTTCCAGTGT 1980 

GTA6ATGCTT CTGTTCTGAA TTAT6ACTGT GATCTTCAGA AAAAG^CaTCA TSGACATGG6 2040 

GTATGTAATA GCAATAAGAA TTGTCACTGT tSAAAATGGCT GGGCTCCCCC AAATTGTGAG 2100 

ACTAAAGGAT ACGGAG6AAG TGTGGACAGT GGA CCTACAT ACAATGAAAT 6AATACTGCA 2160 

TTGAOGQAGG GACTTCTOGT CTTCTTCTTC CTAATTGTTC COCTTATTGT CTGTGCTATT 2220 

TTTATCTTCA TCAAGAGGQA TCAA C TOTGG ASAA6CTACT TCAGAAASAA 6AGATCACAA 2280 

ACATATOAGT CAGATGGCAA AAATCAAGCA AACCCTTCTA 6ACAGCCGG0 GAGTGTTCXrT 2340 

OGACATGTTT CTCCAGTGAC ACCTCCCAGA GAAGTTCCTA TATAT6CAAA CAGATTTGCA 2400 

GTACCAACCT ATGCAGCCAA GCAACCTCAO CAGTTCOCAT CAAGGCCACC TCCACCACAA 2460 

COSAAAGTAT GATCTCAGGa AAACTTAATT C CfGCCQg rC CTGCTCCTGC ACCTOCTTTA 2520 

TATAGTTCCC TCACTTOATT TTTTTAACCT TCTTTTT6CA AATGTCTTCA 6GGAACT6AG 2580 

CTAATACTTT TTTTTTTTCT TGATGTTTTC TTGAAAAGCC TTTCTGTTGC AACTATGAAT 2640 

GAAAACAAAA CACCACAAAA CAGACTTCAC TAACACAGAA AAACAQAAAC TGAGTOTGAO 2700 

AGTTGT6AAA TACAAGGAAA TGCAGTAAA6 CCAGGGAATT TACAATAACA TTTOOGTTTC 2760 

CATCATTGAA TAAGTCTTAT TCAGTCAT06 GTGA6GTTAA TGCACTAATC ATQGATTTTT 2820 

T6AACAT0TT ATTGCAGTGA TTCTCAAATT AACTGTATT6 GT6TAAGATT TTTGTCATTA 2880 

A6TGTTTAAG T6TTATTCTG AATTTTCTAC CTTAGTTATC ATTAAIGTAG TTCCTCATTG 2940 

AACATGTGAT AATCTAATAC CTGTGAAAAC TGACTAATCA GCTGCC3UVTA ATATCTAATA 3000 

TTTTTCATCA TGCA06AATT AATAATCATC ATACTCTAGA ATCTTGTCTG TCACTCACTA 3060 

CATGAATAAG CAAATATTGT CTTCAAAAGA AIGCACAAGA ACCACAATTA AGAT6TCATA 3120 

TTATTTTGAA AGTACAAAAT ATACTAAAAG ASTGTGTGTG TATTCACGCA GTTACTC6CT 3180 

TCCATTTTTA TGACCTTTCA ACTATAGGTA ATAACTCTTA GAGAAATTAA TTTAATATTA 3240 

OAATTTCTAT TATGAATCAT GTGAAAGCAT GACATTGQTT CACAATAGCA CTATTTTAAA 3300 

TAAATTATAA GCTTTAAGGT AOGAAGTATT TAATAGATCT AATCAAATAT GTTGATTCAT 3360 

6GCTATAATA AAQCAGGA6C AATTATAAAA TCTTCAATCA ATTGAACTIT TACAAAACCA 3420 

CTTGAGAATT TCATGA6CAC TTTAAAATCT GAACTTTCAA A6CTTGCTAT TAAATCATTT 3480 

AGAATGTTTA CATTTACTAA GGTGT6CTGG GTCATGTAAA ATATTAGACA CTAATATTTT 3540 

CATAGAAATT AGQCTGGAGA AAGAAGGAA6 AAATGGTTTT CTTAAATAOC TACAAAAAAG 3600 

TTACTGTCGT ATCTATGAGT TATCATCTTA GCTGTGTTAA AAATQAATTT TTACTATGGC 3660 



WO 02/086443 
PCT/US02/12476 

A&AtATGGTA TGQAT06TAA AATTTTAAGC ACTAAAAATT TTTTCATAAC CTTTCATAAT 3720 

AAAGTT7AAT AATAGGTTTA TTAACTGAAT TTCATTAOTT TTTTAAAAGT GrrmWri ' 3780 

TGTGTATATA TACAtAXACA AATACAACAT TTACAATAAA TAAAATACTT GAAATTCTCA 3840 

AAAAAAAAAA AAAAAAAAAA AAAAA 



Seq ID NO: 633 Protein sequence 
Protein Accession #: NP_003807.1 

1 11 21 31 41 51 

I I I I I I 

N6SGARFPSG TLRVRWLLLL GLVGPVL6AA RPCffQQTSHL SSYBIZTPHR LTRBRREAPR 60 

PYSKQVSYVI QAE6KEBIIH LERMXDLLPB DFVWTYHKB GTLinUPHZ QtlBCHYRGYV 120 

B6VHNSSIAL SDCFGLRGLL KLBiASYOIB PLQKSSHFBH IIYRMDDVYK EPUCGGVSNK 180 

DIEKETAKDB EEEPPSMTQL liRRilRAVLPQ TRYVELPIW DKERYDKMGR KQTAVREEMI 240 

LLANYLDSMY IMWIIRIVI»V GLEIWTNGNL INZVG6AGDV IXajPVQWREK FLITRRRHDS 300 

AQLVLKKGFG GTAGMAFVGT VCSRSBAGGX KVFGQITVBT FASZVABELG HNLGMNHDDG 360 

RDCSGGAKSC INHSGASGSR NPSSCSAEDF EXLTLNRG6N CLLNZPKFDB AYSAPSCGKR 420 

LVDAGBECSC GTPKECELDP CCBGSTCKLK SFAECAYGDC CKDC31FLPQG TLCR6KTSEC 480 

DVPEyC39GSS QFCQPDVFIQ KGYPCQNNKA YCYNQ4CQYY DAQOQVIPGS KAKAAPKZX:? 540 

lEVNSKGDRP GNOGPSGNEY KKCATGNALC GKLQCENVQE IPVFGIVPAI IQTPSRGTKC 600 

HGVDPQI.GSD VPDPGMVNBG TKCGAGKICR NFQCVDASVL HYDCDVQKKC HGRGVOISNK 660 

HCHCENGHAP FNCETKGYGO SVDSGPTYKE MtrrALSDGLL VFPFLIVPLI VCAIFZFZKR 720 

OQUmSYFRK XRSQTYESDQ KNQAKPSRQF GSVPRBVSPV TPPREVPIYA NRFAVFTYAA 780 

KQPQQPPSRP PFPQPKVSSQ GMZiIPARPAP APPLYSSLT 

Seq ID NO: 634 DNA sequence 

Nucleic Acid Accession #: NM_002091.1 

Coding sequence: 56..503 

1 11 21 31 41 51 

I I I I I I 

AGTCTCTGCT CTTCCCAGCC TCTCCGG06C GCTCCAAGG6 CTTCCCGTCG GGA0CATGC8 60 

0GQCA6TGAG CTCC06CTGG TCCTGCTGGC GCTGGTCCTC TQCCTAGOGC 0009GG6GG6 120 

A6G6GTCC0Q CT6CCTGGC3G G0G6AGG6AC C3CTGCTGACC AAGATGTACC CXX3Q0GQCAA 180 

CCACTGGGCG GTGGGGCACT TAATGGGGAA AAAGAGCACA GGGGAGTCTT CTTCTGTTTC 240 

TGAGAGAGGO AGCCTGAAOC AGCAGCTGAG AGAGTACATC AGGTGGGAAG AAGCTGCAAG 300 

GAATTT6CTG GGTCTCATAO AAGCAAAGGA GAACAGAAAC CACCAGCCAC CTCAACCCAA 360 

GGCCTT6GGC AATCAG CftGC CTT06T6GGA TTCAGAGGAT AGCAGCAACT TCAAAGATGT 420 

AGGTTCAAAA G6CAAAGTT6 GTAGACTCTC TGCTCCAGGT TCTCAACGT6 AAGGAAGGAA 480 

CCCCCASCTG AACCAGCAAT GATAATGATG GCCTCTCTCA AAA6A6AAAA ACAAAACCCC 540 

TAA6AGACTG AGTTCTGCAA GCATCAGTTC TACGGATCAT CAACAAGATT TCCTTGTGCA 600 

AAATATTT6A CTATTCTGTA TCTTTCATCC TTQACTAAAT TCGTGATTTT CAA6CAGCAT 660 

CTTCTGGTTT AAACTTGTTT GCTGTGAACA ATT6T06AAA AGAGTCTTCC AATTAAT6CT 720 

TTTTTATATC TAGGCTACCT GTTQGTTAGA TTCAAG6GCC 0GAGCIX3TTA OCATTGACftA 780 

TAAAA6CTTA AACACAT 

Seq ID NO: 635 Protein sequence 
Protein Accession #: NP_002082.1 

1 11 21 31 41 51 

I I I I I I 

HRGSSLPLVL LALVLCLAPR GRAVPLPA6G GTVLTKMYPR GNHHAVGHLM GKKSTGESSS 60 

V8ERQSLXQQ LREXIRWEBA ARNIiLGLIBA KENSNHQPPQ PKALCaiQQPS MDSEDS8NFK 120 

DVGSK6ECVGH LSAPGSQRBG RNPQLNQQ 

Seq ID NO: 636 DNA sequence 

Nucleic Acid Accession #: NM_016522.1 

Coding sequence: 265..1299 

1 11 21 31 41 51 

I I I I I I 

GCGGAAGCAQ CGAGGAGGGA GCCCCCTTTG GCCGTCCTCC GTGGAACOGG TTTTCOGAGG 60 

CTGGCAAAAO CCGAGGCTQO ATTTG060GA GGAATATTAO ACTCGGAGGA 0TCTG060GC 120 

■m'TCI'OCTC CCCQOQCCTC CCGGTCGCGG 0GG6TTCA0C GCTCAGTCCC C60QCTC6CT 180 

CCQCACCCCA CCCACTTCCT GTGCTOGCCC GGGGGGGGTG TGCCGTGCGG CTGCOGGAGT 240 

TG6GQGAAGT T6TGGCTGTC GAGAAIt;GGG GTCT6TCGGT ACCTGTTCCT GCCCTGGAAO 300 

TGCCTCGTGO TCGTGTCrCT CAGGCT6CTG TTCCTTGTAC CCACAQQAGT GCCCGTGCGC 360 

AGOGGAGAIG CCACCTTCCC GAAAGCTATG GAC3UIC8TGA CQ8T00GGCA GG6GGA6AGC 420 

GCCACCCTCA GGTGCACTAT TGACAACOQG GTCA0CC9GG TGGCCTGGCT AAA0C6CAGC 480 

ACCATCCTCT ATGCTGG6AA TGACAAGTOG TCCCTGGATC CT0606T6GT CCTTCTGAGC 540 

AAC ACCCA AA CGCAG7ACA0 CATCGAGATC CAGAACGTGG ATGTGTATGA CQAGGGCCCT 600 

TACACCTGCT CGGT6CAGAC AGACAACCAC CCAAAGACCT CTAGGGTCCA 0CTCATTST6 660 

CAAGTATCTC CCAAAAntSr AGAGATTTCT TCAGATA3CT CGATTAAIOA AOGQAACAAT 720 

ATTAGCCTC3V CCTGCATAGC AACT6GTAGA CCA6A6CCTA CG6T7ACTTG QAGACACATC 780 

TCTCCCAAAG OGGTTGGCTT TGTGAGTGAA GAOQAATACT TGGAAATTCA QG6CATCACC 840 

COGSAACAGT CAGCGGACTA CGAGTGCAGT GCCTCCAAUG AOGTGGCOGC GCCC6T6GTA 900 

CGGAGAGTAA AGGTCACOCST GAACTATCCA CCATACATTT CAGAA6CCAA G6GTACA66T 960 

GTCCOOGTOG GACftAAAGGG 6ACACTGCAG TGTGAAOGCr CAGCAGTCOC CTCAGCAGAA 1020 

TTCCAGTG6T AOVAGGATGA CAAAASACTG ATTGAAGGAA AGAAAGtSGGT GAAAGTOGAA 1080 

AACAGAOCTT TCCTCTCAAA ACTCATCTTC TTCAATGTCT CT6AACAT6A CTAT66SAAC 1140 

TACACTTGOG TG6CCTCCAA CAAGCTGGGC CACACCAATG CCAGCATCAT GCTAITTGGT 1200 

CCAG60GCCG TCAGOGAGGT GAGCAACGGC ACX?TOGAGGA GG6CAG6CTG CGTCTQ6CTG 1260 

CTGOCTCITC TGGTCTTGCA CCTGCTTCTC AAATTTTGAT GT6A6TG0CA CTTCCCCACC 1320 

CG6GAAAGGC T6C0G0CACC ACCACCACCA ACACAACA6C AATGGCAACA CCGACAGCAA 1380 

CCAATCAGAT ATATACAAAT GAAATTAGAA GAAACACAGC CTCATGGGAC AGAAATTTGA 1440 

GGGAGGOGAA CAAAGAATAC TTTGGGGGGA AAAGAGTTTT AAAAAAGAAA TTGAAAATTG 1500 

CCntSCAQAT ATTTA6QTAC AATGGAGTTT TCTTTTCCCA AA06GGAAGA ACACASCACA 1560 

OOOQGCTTGG AOOCACTGCA AGCTGCHTOQ TGCAACCTCT TTGGTGCCAG T8TQQQCAAG 1620 

GGCTCAGCCT CTCTQCCCAC AGACTGCCOC OlOGTGGAAC ATTCTGGAGC TGGCCATCCC 1680 

AAATTCAATC AGTCCATASA GAOGAACAGA ATGAGACCTT C0G6CCCAA6 OSTOGOGCTT 1740 

CCGOCCCAAQ OGTGGCGCTG CGGGCACTTT GGTAGACTGT GCCACCACGG CGTGTGTTGT 1800 

GAAAC6TGAA ATAAAAAGAG CAAAAAAAAA AAAAAAAAA 

Seq ID NO: 637 Protein sequence 
Protein Accession #: NP_057606.1 

1 11 21 31 41 51 

I I I I I I 

KGVOGYLPLP WKCLWVSIjR LLFLVPTGVP VRSGDATFPK AMDMVTVRQG ESATLRCTID 60 

NRVTRVAWLN RSTILYAGND KWCLDPRVVL 1*SNTQTQYSI EIQNVDVYDB GPVTCSVQTD 120 

NHPKTSRVHL IVQVSPKIVE ISSDISINBG NNISLTCIAT GRPEPTVTWR HISPKAVGFV 180 

SEDBYLBIQO ITRSQSGDYB CSAS20VAAP WRRVKVTVN YPPYISEAKG TGVFVGQKG7 240 

LQCBASAVPS AEFQWYKDDK RLIBGKRGVK VE3IRPFLSKL IPFKVS&BDY GNYTCVASHK 300 

USanOiSimi PGPGAVSEVS NGTSRRAGCV VfLLPLLVI<Hb UiKF 



Seq ID NO: 638 DNA sequence 

Nucleic Acid Accession #: NM_012261.1 

Coding sequence: 203..1045 



1 

I 

GATTTGCTCT 
ACAGAATA06 
CACTCCAGOO 
GCTCATT06G 
ACTTOGAGTT 
GGAAAATCTC 
TGQGAOGAOS 
OGOCAGCAAC 
T6AQ6T6AAG 
0GCATAT6CA 
GGGGACTTGG 
CAAAGACX3CA 
CACCCXX3GCT 
TGATC06CA0 
TATCTCAGAT 
GGAAGAAACC 
CGCGATTTAC 
ATCXXaWSTAT 
CCAACTGGAT 
CATA6CTACA 
AACCCACGGA 
AT6CTGGGGA 
TGACTCXCCA 
TTOAAAACAT 
TOCTCCCTTO 
TCATGCTCCC 
GTTTAGTGAT 
AAAACGACTA 
GGQGGAOCTO 
TTCTCTSGC 



11 

1 

GCCAGCAGCT 
CGCTCCCTCC 
60GAGTTTGA 
GGCACTGCGA 
CTCCT6ATGT 
TCAGGCCTTT 
TGTCTCATGG 
TAaJTAGArC 
GGCO G CTOTG 
CTCAAAAT6C 
AGGCTGAGCA 
GTCAGrTGCTG 
GGGAAGTCCT 
AA6AG0GTCA 
TTTGTCTTCA 
TTGCCCCTGA 
CACGTCCACC 
AAGCACATG6 
CAOGTAOAAC 
ATCAAACAG6 
AGGGGGAGAC 
GGAGGGGAGG 
AAGAGCAATA 
GCTTCTTTGA 
GACACAGCTO 
TGCAGCAAGA 
TGTCTTGGGA 
ATGTAACTAT 
AAGAATCAAT 



21 
I 

GTOG6TG006 
CTCCCCCTTC 
GGGATTCCCT 
GTAT6GATCT 
TGTTCCATAC 
CCACTAAOCC 
CAGAGTTT6C 
TX3ATCACAGA 
GCCACAGGCA 
TCTTTGTAAA 
AAGTGCAGTT 
GGAAGCACAC 
ATGAGTGTCA 
CCATGATCCT 
GTGAAGA6CA 
TTTTGGGGCT 
ACAAAATGAC 
GCTAGAGGCC 
AACAAAAGCA 
CXTO30TATC 
TCTTTOGGAT 
AGGGTCTCAO 
AATGOCACTT 
GGAGGAAACC 
GCTTATCCTA 
CXXrCTGAAAG 
ATGTTTCACT 
GCAGAGTTGT 
CTGTGT8AGT 



31 
1 

CGCTOSACAC 
TCTGTCCCXX: 
CTCTGGOGGC 
CCAAGGAAGA 
AATGGCTCAA 
T6AAAAAGAT 
AGCCAAATTT 
ACAGGCCGAT 
GTGGGA6CT6 
GGAAAGCCAC 
TGTCTAOQAC 
AGCCAACTOG 
AGCTCAACAA 
GTCTGOGGTC 
TAAATQCOCA 
CATCTTGGGC 
TGCCAACCAG 
GTTAGGCAG6 
CT7TTCCATC 
T6A6GCTTGC 
TTGTAGGGTG 
ACAOCTTTCQ 
GGAGCTGTAT 
CCTTTAGGTT 
TACAGTTGTC 
TGATTCATGC 
GCTACCCGCA 
TTGGACTTCT 
CTGTTTTTCA 



TCCAGCGACT 
TCCTGTGCCA 
AAAT6AAATA 



51 

I 

CTAGGOGCTC 
CACCCOSGCC 
GCACAGCCGG 
GCATCGACAG 
AACAAGAAGT 
TGCGGGAAAA 
ATGATGTGTG 
0006GGGAGC 
0GGT6GATCG 
AGG6ACCTGA 
AAACCCACTT 
CTGCCTTGGT 
TGGCC TCTAG 
CTTTT6ACAT 
GGGAGCAACT 
TGGTAACACT 
CTCGGGACAG 
TCCTGCTCCC 
GATACACCAA 
TCCATOCTTA 
TATTCTCTCC 
GGCTTGGCTT 
AGTTTAGGGA 
TGQGGTGCT7 
GAATACAACC 
CATTCTGCAT 
GCAGCACCAG 
GGTCCAAGTC 
AAACACACTA 



Seq ID NO: 639 Protein sequence 
Protein Accession #: NP_036393.1 

1 11 21 31 41 51 

I I I 1 I I 

MDLQGRGVPS ZDRLRVItLML FRTKAQINAB QEVENLS6L5 TTSPEKDIFW REMGTTCLMA 
EFAARFIVPY DVWAfiKYVDL ITBQADIALT RGASVKGROG HSQSELQVFN VDRAYALKML 
FVKESHNMSR GPEATHRLSR VQFVYDSSEK THFKDAVSAO KRTANSHHLS ALVTPAGKSY 
EOQAQQTISL A5SDFQRTVT MlbSAVHIQP FDIISDFVFS EEEQCCPVDER BQLBETLPLI 
LGLILGLVIM VTLAIYHVKB XMTANQVQIP RDRSQYKHM6 

Seq ID NO: 640 DNA sequence 

Nucleic Acid Accession #: NM_002993.1 

Coding sequence: 64..408 



1 
I 

QGCACQAGCC 
ACTATGA6CC 
GCGCTGCTOQ 
GTCTCTGCTG 
CCCAAAAOGA 
GT6GTAGCCT 
AAGAAAGTCA 
ACCATGCATC 
CAGTAAGAAT 
GAAGAGTGTG 
CTAATATAGT 
CAATTGACCA 
TGAA6ATAAC 
ATTTOGTATG 
ACTCACTCTT 



11 
I 

AGTCTC06CQ 
TCCQ6TCCAG 
CGCTGCTGCT 
TGCTGACAGA 
TTG6TAAACT 
CCCTGAA6AA 
TCCAGAAAAT 
ATAAAATTGC 
AAGAAGGAA6 
GGGGAAAGCC 
ATTTCCACTA 
TATTGT6AGC 
TATTGTATTT 
GAAATAATGT 
CTCATAAAAT 



21 
I 

OCTCCAGOCA 
COGCGOGGCC 
CCT6CTGA0G 
GCTGCGTTGC 
GCAGGTGfTTC 
COQg AAGCAA 
TTTGGACAGT 
CCAGTCTTCA 
GGTTGGTTTT 
TACGCTTCTC 
TTTACTGTTA 
AAAGAATCAC 
CTATCATACA 
TTTATTAGTG 
AGGAAATATT 



31 
I 

GCTCAGQAAC 
OGTGTCC0G6 
CCGC06GGGC 
ACTTGTTTAC 
GCC6CAGGCC 

OGAAACAAGA 
GOGGAGCAGT 
TTTCCATTTT 
CCTGAAGTTT 
TmACCTGA 
T66TTATTAG 
TTCCTTAAAG 
TGCTGTTGAG 
TTAGTTCTGT 



41 
I 

CC606AACCC 
GTCCTTC6G6 
CCCTCGCCAQ 
GCGTTACGCT 
06CAGTGCTC 
ACCGQGAAQC 
AAAACTCSlSr 
TTTCTGGAGA 
CTACATGGAT 
ACAGCTCAGC 
TAAGTTATTO 
TCTTTCAATC 
TCTTACOGAA 
6GAGGTATOC 
TTTCTTGGQO 



51 
I 

TCTCTTGACC 

OGCTGGTCCT 
6AGAGTAAAC 
CAAG6TGGAA 
CCCTTTTCIA 
AACAAAAAA6 
TCCCT6GACC 
TCCCTACTTT 
TAATGAAGTA 
AACOCTTTGO 
AATATTGAAT 
AAGGCTGTGG 
TCTTGITCTT 
AATAT6TTAC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 



60 
120 
180 
240 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 




TCTTTACCCT AGGATGCTAT TTAAGTTGTA CTGTATTAGA ACACT6G6T6 TCTCftTACOG 
TTATCT6T6C AGAATATATT TCCTTATTCA GARriTCTAA AAATTTAAST TCTGTAA666 
CTAATATATT CTCTTCCTAT GGTTTTAGAT GTTTGATGTC TTCTTAGTAT GGCATAATGT 
CATCATTTAC TCATTAAACT TTGATTTTGT ATGCTATTTT TTCACTATAG GATGACTATA 
ATTCTGGTCA CTAAATATAC ACTTTAGATA 6ATGAAGAAG CCCAAAAACA GATAAATTCC 
TGATTGCTAA TTTACATAGA AATGTATTCT CTTGGTTTTT TAAATAAAAG CAAAATTAAC 
AATGATCTGT GCTCTGCAAA GTT7T6AAAA TATATTTCSkA CAATTTGAAT ATAAATTCAT 
CATTTA6TCC TCAAAATATA TACAGCATTO CTAAGATTTT CAGATATCXA TrGTGGATCT 
TTTAAAGGTT TTGACXZATTT TGTTATGAGG AATTATACAT GTATCACATT CACTATATTA 
AAATTGCACT TrTATTTTTT CCTGTOTCTC ATGTTGGTTT TTCGTACTTG TATTGTCATT 
TGGAGAAACA ATAAAA6ATT TCTAAACCAA AAAAAAAAAA AAAAAAA 

Seq ID NO: 641 Protein sequence
Protein Accession #: NP_002984.1

1 11 21 31 41 51

I I I I I I

MSIiPSSRAAR VFGP86SLCA LLAIiXiXiLLTP POPIASAGPV SAVLTGXACT CUEVTLRVISP 
KTIGKLQVFP AGPQCSKVBV VASLKlKiKQV CLDPEAPFLK KVIQKILDSG NKKN 

Seq ID NO: 642 DNA sequence 

Nucleic Acid Accession #: NM_013271.1

Coding sequence: 27..809






1 
1 

TCCGGAGCCA 
CaSGGGGCgr 
TCTGCGOGCG 
A6ACTGGGGC 
TGCAGGAGCT 
GGGC06AG6C 
TCIGGGGCGC 
CTGCA6G6CA 
CCCAGCTTGT 
ACGACGGCCC 
CCGAGCTGTT 
TOQCAGCCrc 
CTGAG6G06T 
TGCCTGCACG 
CAGAAGTGCC 
TTACCC0G6C 
GATCTGAGC 



11 
1 

GGCTCGCTGG 
OGGCCTTTTG 
GCCGGTAAAG 
TCCTOGCOGC 
GGCG06G60G 
GCAG6AGGCT 
C0CC06CAAC 
6CTCGCTG0C 
CCCCGOGCCC 
OGCGGGCCCG 
GAGGTACTTG 
GGGC06CCTC 
GCTGGG6GCG 
COGCCTCTTG 
CCOSCCATCC 
CAGCCAGCCC 



21 
1 

GGCAGCATGG 
GTGCTGCTGC 
GAACCCCGOG 
TTC06GC6GT 
CTGGOGCATC 
GAGGATCA6C 
TCTGATCOGG 
6CTCTGCTCC 
GTCCCOGCOO 
GATGCTGAGG 
CIGGGACGGA 
0GCCGT6C0G 
CTGCTGOGTG 
CCACCCTGAG 
CGCCACCAGG 
TCTCACCCGA 



31 
1 

06GGGTOGCC 
TGCTOGGCCT 
GCXTTAAGCGC 
CAGTGCCCCQ 
TGCTGGAGGC 
A6G0G0GGGT 
CTCTG GG CCT 
G06CC0GCCT 
OSG06CTCGG 
A6GCAGGCGA 
TTCTTGCX3GG 
CCGACCAOQA 
TGAAACGOCT 
CACTGCCOQG 
ACTTCTCCCC 
GGATCCCTAC 



41 

1 

GCTGCTCTGO 
GTTTOGGCCG 
AGOGTCTOCG 
AGGTGAGGC6 
CGAACGTCAG 
CX^TGGCGCRG 
G6A0GA0GAC 
TGACCCTGCC 
ACCCOGGCCC 
CGAGACACGC 
AAGC6CQGAC 
TGTGGGCTCT 
AGA6ACCC0G 
ATCCOGTGCA 
GCCAGCAOGT 
CCCCTGGCCX: 



1 
I 

CCCAGAGCC6 
CTQCCGACTT 
GTTGQCCTCC 
TCCCCTCGAC 
TAGGGTGGTT 
CTAAGCTGAT 
TGTCCCGGA6 
TG6C0QT0QA 
GGCOGTAGQG 
CCGASCCGOS 
GGCCCCGAGG 
GGGGGGGGCT 
TCTGCCTGCA 
CACTTGTTCT 

CCTGTGCCAO
TTTCAGGTGG

GCTCAGTT6A 
TTAATACCCA 
ATTTTATGCT
ATGTCTCAGC 
CTAGAAAAAT 
AAACAGTTTC 
ACAATTTAGA 
TCACTGAGTT 
AAGGA6GTTT 
AAGAGQCTAA 



11 

I 

CCTCCCCCTQ 
OTCTTTGCOC 
CTGCCCACCT 
CTCGC0GGC6 
TCCCCCCX^G 
TTATGCAGCA 
CAGGCTGCGK3 
AGGAQGTGCT 
OCCCTGAGAT 
GGGTCOGCCT 
TCGCCCGGGA 
GTTTTGCATT 
AAACGACCOQ 
TGGACTGGGC 
GTGCCTTGC6 
ATCAAGAAGT 
TTCAATAGAA 
GGTGACACCA 
GAAA6TTCAT 
ATCAATGCAC 
GGCATTTTTC 
ACaVTACATT 
CTGCATGCCT 
TGAGAAAGCA 
TGA06CCATG 
AAGATTGCT6 



21 
I 

TTGCTGGCAT 
<5CTGCTG0QC 
6T86AA6CAA 
TACOCTCCCA 
CTTCGGGCTT 
GAAGCGCCAC 
A6C0CTTGCA 
TCT0G06GAG 
GCCGAGCGGT 
GCTAGGCCT6 
GGCOGAGCCC 
ATGTQOOGCT 
CGAGGTCC06 
CAAGGTGAAG 
CTGGGTCCAG 
GAAC6TTGT6 
TACOCATCTG 
6GAGAAGTGT 
CCTCTGAAGA 
AATAATATAG 
TC0C3GTGACT 
A6CATCCACC 
rcCCATGGAT 
6TTCATA6AC 
CTTCAGGCAG 
CTG6TGATGA 



31 
I 

CCC96AQCTTC 
ASAjOQQGGCT 

CAGATCCAGC 
TGTTTGGGTT 
CGGCTGGAGA 
GAGCCCTCTG 
ACC6CGGGAC 
6CCCGGGCCC 
CGGAAAAOGT 
60GTC0GGAA 
OGGCOCTGGC 
CCTOGTTCCT 
ACAATAGATG 
AATGTGGATG 
ATATT6TTTC 
TGCAIGTTAT 
CTATCCASCT 
AATATOCTGT 
AAAAATTAAA 
TTOBTCTTGG 
OOGAAAOGAT 
ACATCCAT6T 
A6AAC3VTCTC 
CTG T CTGTG A 
CAGATCAGAC 



41 

I 

CTCCCTTGOC 
GCAAAGCTGC 
TOATSOGCCA 
ATCACCCAGT 
TGATTGTGTT 
GAAACAAAA6 
TCCAGT080C 
CG6C06T6CC 
GCTTACCTGC 
CCTAGCGACA 
GGCAGCCAGO 
TTTTTTTACC 
CTGGGCAGCC 
TGCATCTTCA 
GTGTGTTCAA 
CAATTTAATA 
AATACCCACT 
G0GTCCAG6A 
GGATCTTTAT 
TTCCGTTC^ 
ATTTGGCTOl 
TCftTA ATCAA 
6CTGTCTTTG 
TGGAAACATA 
AAGTCaXATC 
6TCTCATCTC 



960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 



51 
I 

GGGC06C6GG 
CXXXXCGCGC 
CCCTTGGCTG 
GCGGGGGCX3G 
GAGCGGGCGC 
CTGCTGCGOG 
CCCXSACGCGC 
GCCCTAGCA6 
CQG6TCTAQG 
GACGTGGACC 
TCCGAGGGGG 
GAGCTGCCCC 
GOGCCCCAGG 
CCXrTGGGACC 
CCAGAGCAAC 
ACAATAACAT 



Seq ID NO: 643 Protein sequence
Protein Accession #: NP_037403.1

1 11 21 31 41 51

I I I I I I

MAGSPLLWGP RAGGVGLLVL LLLGLFRPPP ALCARPVKEP RGLSAASPPL AETGAPRRFR 
RSVPRGEAAG AVQELARALA HIiI£AERQER ARAEAQSAED QQARVLAQLL RVWGAPRNSD 
FALGXiDDDPD APAAQLARAL URARLDPhhL AAQLVPAPVP AAALRPRPPV YDDGPAGPDA 
BEA6DETPDV DPELLRYIiLa RZLAGSADSE GVAAPRRLRR AADBDVGSSL PPBGVLGALL 
RVKRLETPAP QVPARRLLPP 

Seq ID NO: 644 DNA sequence 
Nucleic Acid Accession #: NM_002214
Coding sequence: 681..2990



51 
I 

AGCCAG6AC6 
AACTAATGST 
CAGACTTTTT 
GAATGTACAT 
TGQCTCTTOG 
CTCTTTTCTT 
GC0G6GCCCT 
GAGCCGGQAS 
ACOGCTTGCT 
CTCGCC03CG 
OGGCGGGCGC 
GCTGCA7TT6 
TGGGTGTTTT 
AATGCAGCAT 
GAGGATTTCA 
AGCAAAGGCT 
6AAAATGAAA 
GOO GAAGCTA 
TATCTTGTTG 
AACGATTTAT 
TACGTTGATA 
TGCAGTGACT 
ACAGAGAACA 
GATACACCAG 
GGATGGCGAA 
GCTCTTGATA 



60 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 



60 
120 
180 
240 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
GCAAATTGGC AGGCftTAGTQ GTGCCaVATQ AOQGAMVCTG TCATCTGAAA AACftAOC?rCT 1620

AOBTCAAATC QACAACCHtG GAACAOCCCT CACXAG60CA ACTTTCAGAG AAATTAATAG 1680

ACAACAACAT TAATGTCATC TTTGCAGTTC AAGGAAAACA ATTTCATTGG TATAAGGATC 1740 

TTCTACCCCT CTTGCCAGGC ACCATTGCTG GTGAAATAGA ATC3^GC5CT GCaUUVCCTCA 1800

ATAATTTCGT AGTGGAACCC TATCA6AAGC TCATTTCAGA AGTGAAAGTT CASGTQ6AAA 1860

ACXAGGTACA AGGCATCTAT TTTAACATTA GGOCCATCTG TCCAGATQGG TCCAGAAA6C 1920 

CAG6CATG6A AG6ATGCA6A AA06TGA06A 6CAAT6ATGA AGTTCTTTTC AATGTAACA6 1980 

TTACAATGAA AAAATGTQAT GTCACAGGAO GAAAAAACTA TGCAATAATC AAACCTATTG 2040 

GTTTTAATGA AACCGCTAAA ATTCATATAC ACAGAAACTG CAGCTGTCM TGTGAGGACA 2100 

ACAGAGGACC TAAAG6AAA6 TGT6TA6AT6 AAACTTTTCT AGATTCCAAG TGTTTCCAGT 2160 

GTGA3GA6AA TAAATGTCAT TTTGAT6AAG ATCA8TTTTC TTCT6AGAGT T6CAAGTCAC 2220 

ACAAGGATCA GCCTGTTTGC AGTGGTOBAG GAGTTTOT G T TTGTCOGAAA TGTTCATOTC 2280 

ACAAAATTAA GCTTGGAAAA GTGTATGGAA AATACTGTGA AAAGGATGAC TTTTCTTGTC 2340 

CATATCACCA TGGAAATCTG TGTGCTGGGC ATGGAGAGTG TQAAGCAGGC AGATGCCAAT 2400 

GCTTCAGTGG CTGGGAAGGT GATCGAT6CC AGTGCCCTTC AGCAGCAGCC CAGCACT6TG 2460 

TCAATTCAAA GQGCCAAGT6 TGCA6TGCAA GA6GCACGT0 TGTGTGTGGA AGffTOTGAOT 2520 

GCACOSATCC CAG6A6CATC OGCCGCTTCT GTGAACACTG GCCCACCTGT TATACA6CCT 2580 

GCAAGGAAAA CTGGAATTGT ATGCAATGCC TTCACCCTCA CAATTTGTCT CAGGCTATAC 2640 

TTGATCAGTG CAAAACCTCA TGTGCTCTCA TOGAACAACA GCATTArGTC GACCAAACTT 2700 

CAGAATGTTT CTCCABCCCA AGCTACTTGA GAATATTTT7 CATCATTTTC ATAGTTACAT 2760 

TCTT6ATT6G GTTGCTTAAA GTCCTGATCA TTAGACAGGT QATACTACAA TGGAATAGTA 2820 

ATAAAATTAA GTCCTCATCA GATTACA6A6 TQTCAGCCTC AAAAAAOGAT AAGTTGATTC 2880

TGCAAAGTGT TTGCACAAGA GCAGTCACCT ACOSAOGTGA GAAGCCTGAA GAAATAAAAA 2940 

TGGATATCAG CAAATTAAAT GCTCATGAAA CTTTCAGGTG CAACTTCTAA AAAAAGATTT 3000 

TTAAACACTT AATGGGAAAC TGGAATTGTT AATAATTGCT CCTAAAGATT ATAATTTTAA 3060 

AAGTCACAGG AGGAGACAAA TT6CTCACG6 TCATGCCAGT TGCTGGTTGT ACACTGQAAC 3120 

GAAGACTGAC AAGTATOCTC ATCATGATGT tSACTCACATA GCTGCT6ACT TTTTCAGAGA 3180 

AAAATGTGTC TTACTACTGT TTGAGACTAG TGTCGTTGTA GCACTTTACT GTAATATATA 3240 

ACTTATTTAG ATCAGCATAO AATGTAGATC CTCTGAAOAG CACTGATTAC ACTTTACAGG 3300 

TACCTGTTAT CCCTACGCTT CXCAGAQAGA ACAATGCTGT GAGAGAGTTT AGCATTGTGT 3360 

CACTACAAG6 GTACAGTAAT CCCTGCACTO GACATGTGA5 GAAAAAAATA ATCTGGCAAG 3420 

TATATTCTAA GGTTGCCAAA CACTTCAACA GTTGGTGGTT GAATAGACAA GAACAOCTAG 3480 

AT6AATAAAT GATTCGTGTT TCACTCTTTC AAGAGGTGAA CAGATACAAC CTTAATCTTA 3540 

AAAGATTATT 6CTTTTTAAA GT6TGTAGTT TTATGCATGT GTGTTTATGO TTTGCTTATT 3600 

TTTGCAAGAT GGATACTAAT TCCAQCATTC TCTCCTCTTT GCCTTTATGT TTTGTTTTCT 3660 

TTTTTACAGG ATAAGTTTAT GTATGTCACA GATGACTGGA TTAATTAAGT GCTAAGTTAC 3720 

TACTGCCATA AAAAACTAAT AATACAATGT CACTTTATCA GAATACTAGT TTTAAAA6CT 3780 
GAATGTTAA 

Seq ID NO: 645 Protein sequence
Protein Accession #: NP_002205

1 11 21 31 41 51

I I I I I I

KGGSAIAFPT AAFVCLQNDR RGPASFLHAA WVFSLVIiGI/3 QGEDNRCASS NAASCARCLA 60 

LGPEOGWCVQ EDPISGGSRS ERCDIVSNLI SKGCSVDSIB YPSVHVIIPT ENBINTQVTP 120 

GEVSIQIiRPG AEANFMLKVH PLiKKWVDLY YLVDVSASMH NMXEKLKSVG HDLSSKMAFF 180 

SRDFRLOFQS YVDKTVSPYI SIBPERIHMQ CSDYNLDCMP PBGYIBVLSL TENITEFEXA 240 

VHRQKIS6NZ DTPSGGPDAM LQAAVCBSHI GWRKEAKRLL LVHTDQTSHL AU3SKIAGZV 300 

VPNDOJCHLK NNVYVKSTTM EHPSLGQLSE KLIEHWINVI PAVQGKQFHW YKDLLPLLPG 360 

TIAGEIESKA ANLNNXjWEA YQKLISEVKV QVEKQVQGIY FNITAICPDO SRKPQ4EGCR 420 

NVTStlDCVItF NVTVTMXKCD VTGG»TYAZX KPI6FNBTAK ZHIB5NCS0Q CEDXJRGPKGK 480 

CVDETPLDSK CFQCDENXCH FDEDQFSSES CKSHKDQPVC 8GRGVCVGGK CSCHKIKLGK 540 

VYGKYCBKZ30 PSCPYHBGNL CAGHGECBAG RCQCPSGHEG DROQCPSAAA (^CVNSKGQV 600 

CSGRGTCVOG RCECTDPRSI GRPCEHCPTC YTAGKENWHC HQCLBPBNLS QAILDQCRTS 660 

aOiMBQQHYV DQTSECPSSP SYIiRIFPIIP IVTFLIGLLK VLIIRQVILQ HNSNKIKSSS 720 
DYRVSASKKD KLILQSVCTR AVTYRSEKPE EIEa4DISKIiN AHETFRCNP 

Seq ID NO: 646 DNA sequence

Nucleic Acid Accession #: NM_003318.1

Coding sequence: 1..2574

1 11 21 31 41 51

I I I I I I

ATGGAATC06 AGGATTTAAG TGGCAGAGAA TTGACAATTG ATTCCATAAT GAACAAAGTG 60 

AGAGACATTA AAAATAAGTT TAAAAATGAA GACCTTACTO ATGAACTAAO CTTGAATAAA 120 

ATTTCTGCT6 ATACTACAGA TAACTOQGGA ACTGTTAACC AAATTATGAT GAT66CAAAC 180 

AACOCAGAGG ACTGGTTGAG TTT6TT6CTC AAACTAQAGA AAAACAGTGT TCOGCTAAGT 240 

QAT6CTCTTT TAAATAAATT GATTGGTOST TACM3TCAA6 CAATT6AA6C GCTTCCCCCA 300 

QATAAATATG GCCAAAATGA GAGTTTTGCT AGAATTCAAG TGAGATTTGC TGAATTAAAA 360 

GCTATTCAAG AGCCAGATGA TGCAOGTGAC TACTTTCAAA TGGCCAGA6C AAACTGCAAG 420 

AAATTT6CTT TTGTTCATAT ATCTTTTGCA CAATTTGAAC TGTCACAAGG TAATGTCAAA 480 

AAAAGTAAAC AACTTCTTCA AAAAGCTGTA GAAOQTGGAS CAGTACCACT AGAAATGCT6 540 

6AAATTGGCC TGOGGAATTT AAACCTOCAA AAAAAGCA6C TGCTTTCAGA GGAGGAAAAG 600 

AAGAATTTAT CAGCATCTAC G6TATTAACT GCCCAAGAAT CATrTTCCGG TTCACTTGGG 660 

CATTTACAGA ATAGGAACAA CAGTPGTGAT TCCAGAGGAC AGACTACTAA AGCCAOGTTT 720 

TTATATGGAG AGAACATGCC ACCACAAGAT GCA6AAATA0 GTTA008GAA TTCATTGAGA 780 

CAAACTAACA AAACTAAACA GTCAT600CA TTTGOAAOAG TCOCAGTTAA GCTTCTAAAT 840 

AGOXAGATT GT6ATGTGAA GACAGA'TGAT TCAGTT8TAC CTTGTTTTAT GAAAAGACAA 900 

ACCTCTAGAT CAGAATGCCG AGATTTGGTT GTGCCTGGAT CTAAACCAAG TGGAAATGAT 960 

TCCTGTOAAT TAAGAAATTT AAAGTCTGTT CAAAATAGTC ATTTCAAGGA ACCTCTQGTG 1020 

TCAGA1GAAA AGAGTTCTGA ACTTATTATT ACTGATTCAA TAACCCT6AA GAATAAAA06 1080 

GAATCAAGTC TTCTAGCTAA ATTAGAAGAA ACXAAAGAOT ATCAAGAAC3C AGAGGTTOCA 1140 

GAGAGTAACC A6AAACAGTG GCAATCTAAG AQAAAOTCAG A6TGTATIAA CCAGAATOCT 1200 

GCPGCATCTT CAAATCACTG GCAGATTCCG GAGTTAGCCC GAAAAGTTAA TACAGAGCAO 1260 

AAACATACCA CTTTTGAGCA AOCrcrrCTTT TCAGTTTCAA AACAGTCACC ACCAATATCA 1320 

ACATCTAAAT GGTTTGACCC AAAATCTATT T6TAAQACAC CAA6CA6CAA XACC7TGGAT 1380 
G^ITTACAIXSA GCTGTTTEA6 AACTOCAGTT GTAAAGAMG ACTTTOCACC T6CTTGTCAG 1440

TTGTCAACAC CTTATGGCCA ACCTQCCTGT TTCCAGCAGC AACAGCATCA AATACTT60C 1500 

ACTCCACTTC AAAATTTACA GGTTTTAGCA TCTTCTTCAG CAAATGAATG CATTTOGGTT 1560 

AAAGGAAiGAA TTTATTCCAT TTTAAA6CAG ATAGGAACTTG GAGGTTCAAG CAAGGTA7TT 1620 

CAGGT6TTAA ATGAAAAGAA ACAGATATAT GCTATAAAAT ATGTGAACTT AGAAGAAGCA 1680

GATAACCAAA CTCTTGATA6 TTA00G6AAC GAAATAGC7T ATTTGAATAA ACTACAACAA 1740

CACAGTGATA AGATCATCOG ACTTTAT6AT TATGAAATO^ CG6ACCAGTA CATCTACATG 1800

GTAATGGAGT GTGGAAATAT T6ATCTTAAT ACTTG&CTTA AAAAGAAAAA ATCCATTGAT 1860

CCATGGGAAC GCAAGAGTTA CTG6AAAAAT ATGTTA6AGG QU3TTCACAC AATCCATCAA 1920 

CATGGCATTG TTCACAGTGA TCTTAAACCA GCTAACTTTC TGATA6TTGA TGGAATGCTA 1980

AAGCTAATTG ATTTIGGGAT TGCAAAOCAA ATGCAAOCAG ATACAACAAG T6TTGRAAA 2040 

GATTCTCAGG TTGGCACAGT TAATTATATG CCACXrAGAAG CAATCAAAGA TATGTCTTCC 2100 

TCCAGAGA6A ATGGGAAATC TAA6TCAAAG ATAAGCCCCA AAASTGATGT TTGGTCCTTA 2160 

6GATGTATTT TGTACTATAT GACTTACGGO AAAACACCAT TTGAGCA6AT AAT7AATCAG 2220 

ATTTCTAAAT TACATGCCAT AATTGATCCT AATCATGAAA TTGAATTTCC CGATATTOCA 2280 

GAGAAAGATC TTCAAGAT6T GTTAAAGTGT TGTTTAAAAA GGGACCCAAA ACAGAGGATA 2340 

TCCATTCCTG AGCTCCTGGC TCATCCCTAT GTTCAAATTC AAACTCATCC AGTTAACCAA 2400 

ATGGCCAAGG GAACCACTGA AGAAATGAAA TATGTTCTGG GCCAACTTGT TGGTCTQAAT 2460 

TCTCCTAACT CCATTTTGAA AGCTGCTAAA ACTTTATATG AACACTATAG TGGTGOTGAA 2520 
AGTCATAATT CTTCATCCXC CAAGACTTTT GAAAAAAAAA GGG6AAAAAA ATGA 

Seq ID NO: 647 Protein sequence
Protein Accession #: NP_003309.1

1 11 21 31 41 51

I I I I I I

MESEDIiSGRE LTZDSIMHKV RDIKNKFKNS DLTDSLSLMK ISADnnNSG TVKQIMMI>^ 60 

NPEDHLSUjL ICLEKHSVPLS DALUIXLIGR YSQAIEALPP DKYGCHaSSFA RIQVRFAQiK 120 

AIQEPDDARD YFQKARANCK KFAFVHISFA QFELSQGNVK KSKQLLQKAV ERGAVPLEML 180 

EIAXJaJhSUQ KKQhLSEBEK KKLSASTVIiT AQBSFSG5LG HLONSNNSCD SRGQ77KARF 240 

LY6BNMPPQD AEIGYRNSLR QTNKTKQSCP FGRVPVNLLM SPDCDVKTDD SWPCPMKRQ 300 

TSRSECRDLV VPGSKPSGND SCEI*RNIiKSV QNSEFKEPLV SDEKSSELZI TDSZTIiKNKT 360 

ESSIiZiAKLEE ^TKEYQBPEJVP BSNQKQWQSK RKSBCZNQNP AASSNBWQZP ELARKVNTBQ 420 

KBTTFEQFVF SVSKQSFPZS TSKWFDPRSZ CKTPSSNTW DYMSCPRTPV VKHDFPPAOQ 480 

IfSTPYGQPAC FQQQQRQZLA TPLQNDQVLA 5SSANECZSV KGRZYSZLKQ ZGSGGSSKVF 540 

QVIiMEKKQiy AZKYVNLEEA DNQTI4DSYBN EZAYIUKLQQ HSDKZIRLYD YBITDQYZYM 600 

VMBCXSNZDLN SWLKKKKSZD FHSRKSYWKN KLEAVHTZHQ HGZVHSDLKP AMFLZVDGML 660 

KLZDF6ZANQ MQPOTTSWK DSQVaTVNYM PPEAXXStMSS SRBIGKSKSK I8PKSDVHSL 720 

GCtLVYMTyO KTPFQQZZKQ ISKLRAZZDP HHEIBPPDZP BKDLQDVLKC CLKRDPKQRX 780 

SIPELLABpy VQZQTRPVNQ MAK0TTED4IC yVLGQLVGUi SPNSILKAAR TLYBHYSGGB 840 
SHN8SSSKTF ERKRGXK 

Seq ID NO: 648 DNA sequence
Nucleic Acid Accession #: NM_015507
Coding sequence: 241..1902

1 11 21 31 41 51

I I I I I I

C0GCA6AGGA GOCTOGGCCA GGCTAGCCAG 6G080C0CCA GCCCCTCCCC AOOCCGOGAG 60 

06CCCCTGCC GCGGTGCCTG GCCTCCCCTC CCAGACTGCA GGGACacCAC CX3GGTAACTG 120 

OGAGTGGAGC GGAGGACCOG A6CG6CTGAG GAGAGAGGAG GCGGCGGCTT AGCTGCTAC6 180 

GGGTC06GCC GGGGCGCTCC CX3AQGGGGGC TCAGGAGGAG GAAGGAGGAC C06T606AGA 240 

ATQGCTCTGC CCTGQAOOCT TGOQCTOCCO CTGCTGCTCT CCTGGOTGGC AGGTG6TTTC 300 

GGGAACGOGG CCAGTGCAAG GCATCAOGOO TTGTTA6CAT 006CACGTCA GCCTGGG6TC 360 

TGTCACTATG GAACTAAACT GGCCTGCTGC TACGGCTGGA GAAGAAACAG CAAGGGAGTC 420 

TGTGAAGCTA CATOCGAACC TGGATGTAAG TTTGGTGAGT GCGTQGGACC AAACAAATGC 480 

A6ATCCTTTC CAGGATACAC 06G6AAAACC TGCAGTCAAO ATGTQAATOA OTGTGGAATG 540 

AAACOCOQOC CAT6CCAACA CAGATGTGTG AATACACAOQ GAAOCTACAA GrGCTTTTOC 600 

CTCAGTG6CC ACATGCTCAT GCCAGATGCT AOGTGT O T Q A ACTCTAGGAC ATGTGCCATG 660 

ATAAACTGTC AGTACAOCTQ TGAA6ACACA GAA6AAG6GC CACAGTGCCT GTGTCCATCX: 720 

TCAG6ACTCC GCCTGGCCOC AAATGGAAGA GACTGTCTAG ATATTGATGA ATGTGCCTCT 780 

GGTAAAGTCA TCTGTCCCr A CAAT0SAA6A TGTGTGAACA CATTIGGAAG CTACTACTGC 840 

AAATGTCACA TTGGTTTOSA ACTGCAATAT ATCAGTGGAC GATATGACTG TATAGATATA 900 

AATGAATGTA CTATG6ATAG CCA7ACGT6C AGCCACCATG CCAATTGCTT CAATACCCAA 960 

GG6T0CTTCA AGT6TAAATQ CAAGCAGGGA TATAAAGGCA ATGGACTTCG GTGTTCTGCT 1020 

ATC0CT6AAA ATTCTGTGAA GGAAGTCCTC AGAGCACCTG GTACCATCAA AGACAGAATC 1080 

AAGAAGTTGC TTGCTCACAA AAACAGCATQ AAAAAGAAGG CAAAAATTAA AAATGTTACC 1140 

CCAGAACCCA CCAGGACTCC TACCCXTAAO GIGAACTT6C AGCCCTTCAA CTATGAAGAG 1200 

ATAGTTTCCA GAGGCGGGAA CTCTCATQGA GGTAAAAAAG GGAATGAAGA GAAAATGAAA 1260 

QAOGGGCTTG AG6ATGAGAA AAGAGAAGAO AAAGCGCTGA AGAATGACAT AGAGGAGCGA 1320 

AGCCT6C3GAG GAGATGTGTT TTTCXTTAAG GTGAATGAAG CAGGTGAATT CGGCXTTGATT 1380

CTGGTCCAAA GGAAAGCGCT AACTTCCAAA CTGGAACATA AAGATTTAAA TATCTCGGTT 1440 

GACTGCAGCT TCAATCATGQ GATCTGTGAC TGGAAAGAGG ATAGAGAASA TGATTITGAC 1500 

TGGAATOCTG CTGATOGAGA TAAT6CTATT GGCTTCTATA TGGCAGTTCC GGCCTTGGCA 1560 

GGTCACAAGA AAGACATTGO CXS3ATTGAAA CTTCTCCTAC CTGACCTGCA ACXXX3UJtf3C 1620 

AACTTCTOrr TGCTCTTTGA TTACX3G0CTG GCCGGAGACA AAGTCGGGAA ACTTCGAGTG 1680 

TTTGTGAAAA ACAGTAACAA T6CCCTGQCA TGGGAGAAGA CCAOGAGTGA GGATGAAAAG 1740 

T66AAGACAG 66AAAATTCA GTTGTATCAA GGAACTGATG CTACCAAAAG CATCATTTTT 1800

6AAGCAGAAC GTGGCAAGGG CAAAACXX36C GAAATOGCAG TGGATGGCGT Cl ' mcnvri ' 1860 

TCAGGCTTAT GTCCAGATAG CCTTTTATCT GTGGATGACT GAATGTTACT ATCTTTATAT 1920 

TTGACTTTGT AT8TCAGTTC CCTQGTTTTT TTGATATTGC ATCATAGGAC CTCTGGCATT 1980 

TTAGAATTAC TAGCTGAAAA ATTGTAATGT ACCAACAGAA ATATTATT6T AAGATGCCTT 2040 

TCTTGTATAA GATATGCCAA TATTTGCTTT AAATATCATA TCACTGTATC TTCTCA6TCA 2100 

TTTCTGAATC TTTCCACATT ATATTATAAA ATATGQAAAT GTCAGTTTAT CTCCCCTCCT 2160 

CAGTATATCT GATTTGTATA AGTAAGTTGA TQAGCTTCTC TCTACAACAT TTCTAGAAAA 2220 

TAGAAAAAAA AGCACA6A6A AATGTTTAAC TGTT7QACTC TTATGATACT TCTTGGAAAC 2280 

TA76ACATCA AAGATAGACT TTTGCCIAAG TGGCTTAGCT GGGTCTTTCA TA60CAAACT 2340 




TQTATATTTA AATTCTTTGT AftTAATAJkTA TCCAAATOVr CAAAAAAAAA AAAMVAAA 



Seq ID NO: 649 Protein sequence 
Protein Accession #: NP_056322

1 11 21 31 41 51

I I I I I I

MPLPWSLALP LU.SWVAGGF CaiAASARHHG LLASARQPGV GHYGTKLACC YGWRRKSKGV 60 

CEATCEPGCK FGECV6PNKC RCFP6YTGKT CSQDVNGOGM KPRPOQBRCV NTSGSYKCPC 120 

LSGBNLHPOA TCVNSSTCAM ZKOQYSCBDT BB6PQCLCPS SGLRLAFSCR DCLDIDBCAS 180 

GKVICPYKRR CVNTFCSVYC KCHXGFELQY ISGRYDCIDI NECTMDSETC SHHANCFNTQ 240 

6SFKCKCKQG YKGNGLRCSA IPENSVKEVL RAPGTIKDRI KKLLABKKSM KKKAKZKI7VT 300 

PEPTRTPTPK VNLQPnnfEB rvSRGGKSKG GKK107SSKNK EGLBDEKREB KALKNDIEBR 360 

SLRGDVFFPK VNSAGEFGLI LVQRKALTSK LEHKDLNISV DCSmiGICD HKQDHEX30FD 420 

MMPADRIIKAI GFYMAVPALA GHKKDIGRLK LLLTOLQPQS NFCLLFDyRIi AGDKVGfELRV 480 

FVKNSNNALA WEKTTSEOEK HKTOKXQIiyQ GTDATKSZZP EAERGKSRTS EZAVDGVLLV 540 
SGLCPDSLLS VDD 

Seq ID NO: 650 DNA sequence 

Nucleic Acid Accession #: NM_003506.1

Coding sequence: 259..2379

1 11 21 31 41 51

I I I I I I

GCAGCTCCAS TCCCOGAOGC AACCCCGGAO CCX3TCTCAGG TCCCTGGGGO GAA0Q6TGGG 60 

TTAGA06GGQ ACGGGAAGGG ACAGCGGCXTT TCXSAC06CCC CCOSAGTAAT TSACCCAGGA 120 

CTCATTTTCA GGAAAQCCTG AAAATGA6TA AAATAGTGAA ATGAGGAATT TGAACATTTT 180 

ATCTTTGGAT GGGGATCTTC TGAGGATGCA AAGAGTGATT CATCCAAGCC ATGTGGTAAA 240 

ATGAGGAATT T6AAGAAAAT GGAGATGTTT ACATTTTTGT TGACGTGTAT TTTTCTACOC 300 

C7CCTAAGA6 GGCACAGTCT CTTCACCTGT GAACCAATTA CTGTTCCCAG ATGTAT6AAA 360 

ATGGCCTACA ACATGAOGTT TTTCCCTAAT CTGATOOGTC ATTATGACCA GAGTATTGCC 420 

GOGGTGGAAA TGGAGCATTT TCTTCCTCTC GCAAATCTGG AATGTTCACX: AAACATTGAA 480 

ACTTTCCTCT GCAAAGCATT TGTACCAACC TGCATAGAAC AAATTCATGT GGTTCCACCT 540 

TGTOCTAAAC TTTGTGAGAA AGTATATTCT GATTQC31AAA AATTAATTGA CACTTTTGGO 600 

ATCCX3ATGGC CTGAGGA6CT TGAATGTGAC AGATTACAAT ACTGTGATGA GACTGTTCCT 660 

GTAACTTTTG ATCX:ACACAC AGAATTTCTT GGTCCTCAGA AGAAAACAOA ACAAGTCCAA 720 

AGA6ACATTG GATTTTGGTG TCCAA6GCAT CTTAA6ACTT CTGGQGGACA AGGATATAAO 780 

TTTCT6G6AA TT6ACCA6TG TG06CCTCCA T8000CAACA TGTATTTTAA AA6TGATGA6 840 

CTAGAQTTTG CAAAAAGTTT TATTGGAACA GTTTCAATAT TTTGTCTTTQ TGCAACTCTG 900 

TTCACATTCC TTACrTTTTT AATTGATGTT AGAAGATTCA GATACCCAGA GAGACCAATT 960 

ATATATTACT CT6TCTGTTA CAGCATTGTA TCTCTTATGT ACTTCATTGO ATTTTT6CT0 1020 

Q60QATAGCA CAGCCTGCAA TAAG6CAGAT GAGAAQCTAO AACTTGGTGA CACTGTTGTC 1080 

CTAGGCTCTC AAAATAAOGC TTGCAC08T7 TT0TTCAT8C 7TTTGTATTT TTTCACAAT6 1140 

GCTG6CACTQ TGTGGTGGGT GATTCTTACC ATTACTTGGT TCTTAGCTGC AGGAAGAAAA 1200 

TGGAGTTC5TG AA6CCATOGA GCAAAAAGCA GTGTGGTTTC ATGCTGTTGC ATGGGGAACA 1260 

CCAGGTTTCC TGACTGTTAT GCTTCTTGCT CTGAACAAAG TTGAAGQAGA CAACATTAGT 1320 

GGAGTTTGCT TTGTTQGCCT TTATGSAGCTQ GATQCTTCTC GCTACTTTGT ACTCTTGCCA 1380 

CT6TGCCTTT GTGTGTTTOT TGGQCTCTCT CTTCTTTTAfi CTGGCATTAT TTCCTTAAAT 1440 

CAT6TTG6AC AAGTCATACA ACATGATGGC 06GAACCAA6 AAAAACTAAA GAAATTTAT6 1500 

ATTCGAATTG GAGTCTTCAG OGGCTTGTAT CTTGT6CCAT TAGTGACACT TCTC3GGATGT 1560 

TACGTCTATG AGCAA6TGAA CAGGATTACC T66GAGATAA CTTGGGTCTC TGATCATTQT 1620 

CGTCAGTACC ATATCCCATG TCCTTATCAG G»AAAGCAA AAGCTOGACC AQAATTG6CT 1680 

TTATTTATGA TAAAATACCT GAT6ACATTA ATTGmGGCA TCTCTOCTGT CTTCTGGGTT 1740 

GGAAGCAAAA AGACATGCAC AGAATGGGCT GGGTTTTTTA AACGAAAT06 CAA6A6AGAT 1800 

CCAATCaOTO AAAGTCX5AAG AGTACTACAG GAATCATGTG AGTTTTTCTT AAAGCACAAT 1860 

TCTAAAGTTA AACACAAAAA GAA6CACTAT AAACCAAGTT CACACAAGCT GAAGGTCATT 1920 

TCCAAATGCA TGGGAACCAG CACAGGAGCT ACAGOUUVTC AT6GCACTTC TGCAGTAGCA 1980 

ATTACTA6CC ATGATTAOCT AGGACAAGAA ACTITGACAG AAATCCAAAC CTCACCAGAA 2040 

ACATCAATGA GAGAGGTGAA AGCGGAOGGA GCTAGCACCC CCA6GTTAAG AGAACA6GAC 2100 

TGTGGTGAAC CTGCCTOOCC AGCAGCATCC ATCTCC3«S^C TCTCTGGGGA ACAGGTOGAC 2160 

GGGAAGGGCC AOGCAGGCAG 76TATCTGAA AGTO0600GA GTGAAGGAAG GATTA6TCCA 2220 

AAOAGTOATA TTACTGACAC TOG<XTGGCA CAGAGCAACA ATTTGCAOGT COCCAGTTCT 2280 

TCAGAACCAA 6CA0CCTCAA AGGTTCCACA T C rCIWrm TTCACCCAGT TTCA66A6TG 2340 

AGAAAAGAGC AGGGAGGTOO TTGTCATTCA GATACTTGAA GAACATTTTC TCTCGTTACT 2400 

CAGAAGCAAA TTTGTGTTAC ACTGGAAGTQ ACCTATGCAC TGTTTTGTAA GAATCACTGT 2460 

TA08TTCTTC TTTTGCACTT AAAGTTGCAT TGCCTACTGT TATACTGGAA AAAATAQAGT 2520 

TCAASAATAA ZATGACTCAT TTCACACA AA G OTTAATG AC AACAATATAC CTSAAA ACAQ 2580 

AAATQTGCAG GTTAATAATA 77TTTTTAAT AGTGTGQGA6 GACAQAGTTA GA6GAATCTT 2640 

CCTTTTCTAT TTATGAAGAT TCTACTCTTG GTAAGAGTAT TTTAAQATGT ACTATGCTAT 2700 

TTTACCTTTT TGATATAAAA TCAAGATATT TCTTTGCTGA AGTATTTAAA rCTTATCCTT 2760 

GTATCTTTTT ATACATATTT GAAAATAAGC TTATATGTAT TTGAACTTTT TTGAAATCCT 2820 

AnCAASIAT TTTTATCATO CTATTGT6AT ATTTTAGCAC TTTQ6TAGCT TTTACACT8A 2880 

ATTTCTAAGA AAATT6TAAA ATACTCTTCT TTTATACTOT AAAAAAAGAT ATACXAAAAA 2940 

GTCTTATAAT AG6AATTTAA CTTTAAAAAC CCACTTATTG ATACCTTACC ATCTAAAATG 3000 

TGTGATTTTT ATAGTCTOGT TTTAGGAATT TCACAGATCT AAATTATGTA ACTGAAATAA 3060 

GGTGCTTACT CAAAGAGTQT CCACTATTGA TTGTATTATO CTGCTCACTG ATCCTTCTGC 3120 

ATATTTAAAA TAAAATGTCC TAAA06GTTA GTA6ACAAAA IGTTAGTCTT TTGZATATTA 3180

GGOCAAGTGC AATTQACTTC CCTTTTTTAA TGTTTCATGA GCACCCATT6 ATTGTATTAT 3240 

AAOCACTTAC AGTTQCTTAT ATTTTTTGTT TTAACTTTTG TTTCTTAACA TTTAGAATAT 3300 
TACATTTTGT ATTATACAGT ACCTTTCTCA GACATTTrGT AG 

Seq ID NO: 651 Protein sequence
Protein Accession #: NP_003497.1

1 11 21 31 41 51

I I I I I I
MEKETFXtLTC XPLPLXiRGBS LFTCEPXTVP RCKKKAYNMr FFPHLKGEVD QSZAAVOIBH 60

PLPLAMLBCS PHIETFLCKA FVPTCZEQIH WPPCRXL^ KVYSDCKRLI DTF6ZRHPBB 120 

LHCDRLQYO) ETVPVTFDPH TEPLGPQKKT EQVQRDIGPW CPHHLKTSGG QGYKFLGIDQ 180

CAPPCPNMYP KSDELEFAKS FIGTVSIFCL CATLFTFLTP LIDVRRFRYP ERPIIYYSVC 240 

YSIVSU1YFI GFLIGDSTAC KKKDJSKbSLG DTWLGSQNK ACTVX^FMIiLY PFTMAGTVHW 300 

VILTITNFLA AGRKHSCBAZ BQKKVWFHKV AHGTP6FLTV MLLAEiNKVBG ONISGVCFVG 360 

LYDLDASRyP VLIiPLCLCVF VGLSLLLAGI ISUIRVRQVI QHDGRNQEXL XKFNIRX6VF 420 

SGLYLVPLVT LLGCyVYBQV NRITWBITWV SDHOIQYHIP CPYQAKAKAR PELALPMIKY 480

UCTLIVGISA VFWVGSKKTC TEWAGPFKRN RKRDPISESR RVI.QESCEFP LKHNSKVKHK 540 

KKHYKPSSHK LKVISKSMGT STGATANHGT SAVAZTSEDY LGQBTLTEIQ TSPETSMREV 600 

KADGASTPRL REQDOGBPAS PAASZSRZ.S6 BQVDGKGOAG SVSBSARSBG RZSPKSDZTD 660 
TGLAQSHNLQ VPSSSEPSSL KGST6LLVBP VS6VRKBQGG QXSSBft 

Seq ID NO: 652 DNA sequence

Nucleic Acid Accession #: NM_014791.1

Coding sequence: 171..2126

1 11 21 31 41 51

I I I I I I

TTGGCX3GGCG GAAGOGGCCA CAACCOGGOQ ATOGAAAAQA TTCTTAGGAA CQCOGTACCA 60 

GCCGOGTCTC TCAGGACAGC AGGCCCCTGT CCTTCrGTCG GGCGCCGCTC AGCCX3TGCCC 120 

T0CX3CCCCTC AGOTTCTTTT TCTAATTCCA AATAAACTTG CAAGAGGACT AT6AAAGATT 180 

ATGATCAACT TCTCAAATAT TATGAATTAC ATGAAACTAT TGGGACAGGT GGCTTTGCAA 240 

AGGTCAAACT TGCCT6CCAT ATCCTTACTG GAGAGATGGT ASCTATAAAA ATCATGQATA 300 

AAAACACACT AGGGAGTGAT TTGCCCCG6A TCAAAACGGA GATTGAGGCC TTGAAGAACC 360 

TGAGACATCA GCATATATGT CAACTCTACC ATGTGCTAGA GACAGCCAAC AAAATATTCA 420 

TGGTTCTTGA GTACTGCCCT GGAGGAGAGC TGTTTGACTA TATAATTTCC CAGGATCGCC 480 

T6TC3U3AA6A 6GAGACC0GG GTTGTCTTCC C?rCAGATAGT ATCTGCT6TT GCTTATGTGC 540 

ACAGCCAGGG CTATGGTChC AGGGACCTCA AGCCAGAAAA T TlX^CriS T l't GATGAATATC 600 

ATAAATTAAA GCTGATTGAC TTTGGTCTCT GTGCAAAACC CAAGGGTAAC AA66ATTACC 660 

ATCTACAGAC ATGCTGTGGG AGTCTGGCTT ATGCAGCACC TGAGTTAATA CAAGGCAAAT 720 

CATATCTTGO ATCAQAGGCA GATGTTTG6A GCATGGGCAT ACTGTTATAT GTTCTTATGT 780 

GIGaATTTCT ACCATTTGAT 6ATGATAAZG TAAT06CTTT ATACAAGAAG ATTATGAGAG 840 

6AAAATAT6A TGTTCCCAA6 T66CTCTCTC CCAGTAGCAT TCT a C l TC l " f CAACAAAT6C 900 

TGCAGGTGGA CCXIAAAGAAA CGGATTTCTA TGAAAAATCT ATTGAACCAT CCCTGGATCA 960 

TGCAAGATTA CAACTATCCT GTT(2A0TGGC AAAGCAAGAA TCCTTTTATT CACCTCXSATG 1020 

ATGATTGOGT AACAGAACTT TCTQTACATC ACAGAAACAA CAGGCAAACA ATGGAGGATT 1080 

TAATTTCACT GT60CAGTAT GATCACCTCA OGGCTACCTA TCITCTGCTT CTAGOCAAGA 1140 

AGGCTCOGGG AAAACCAGTT C6TTTAAGGC mViTCViT CT CC TGTGGA CAAGCCAGTG 1200 

CTACCCCATT CACAGACATC AAGTCAAATA ATTGGAGTCT GGAAGATGTG ACOGCAAGTG 1260 

ATAAAAATTA TGT6G0GGGA TTAATAGACT ATGATTGGTG TGAAGATGAT 7TATCAACAG 1320 

GTGCTGCTAC TOCOQGAACA TCACA67TTA CCAAGTACTG GACAGAATCA AATGGGGTG6 1380 

AATCTAAATC ATTAACTCCA GCCTTATGCA GAACACCTGC AAATAAATTA AAGAACAAAG 1440 

AAAATGTATA TACTCCTAAO TCTX3CTGTAA A6AATGAAGA GTACTTTATG TTTOCTGAGC 1500 

CAAAGACTCC AGTTAATAAC AACCAGCATA AGAGAGAAAT ACTCACTA06 CCAAATOGTT 1560 

ACACTACACC CTCAAAAGCT AGAAACCAGT GGCTGAAAGA AACTCCAATT AAAATACGAG 1620 

TAAATTCAAC AGQAACAGAC AAGTTAATGA CAGGTGTCAT TA6CCCTGAG AGGOGGTGCC 1680 

OCTCAOTOOA AT7GGA7CTC AACCAAGCAC ATATGOAOGA GACTCCAAAA AOAAAGOGAG 1740 

CCAAAGTGTT T(K3QAGCCTT GAAAGGGGQT TGQATAA8GT TATCACT6T6 CTCACCAGGA 1800 

GCAAAAGGAA GGGTTCTGCC AGAGACGGGC CCAGAAGACT AAAGCTTCAC TATAATGTGA 1860 

CTACAACTAG ATTAGTGAAT CCAGATCAAC TGTTGAATGA AATAATGTCT ATTCTTCCAA 1920 

AGAAGCATGT TGACTTTGTA CAAAAGGGTT ATACACTGAA GTGTCAAACA CAGTCAGATT 1980 

TTGGQAAAGT GACAATOCAA TrTQAATTAO AAGTGTGCCA OCTTCAAAAA CCOGATGTGG 2040 

TGGGTATCAG 6A0QCAG0Q6 CTTAAGGGG6 ATGCCTGGGT TTACAAAA6A TTAGTGGAA6 2100 

ACATOCTATC TAGCTGCAAG GTATAATTGA TGGATTCTTC CATCCT6CCX5 GATGAGTGTG 2160 

GGTGTGATAC AGCCTACATA AAGACTGTTA TGATCGCTTT GATTTTAAAG TTCATTGGAA 2220 

CTACCAACTT GTTTCTAAAG AGCTATCTTA AGACCAATAT CTCTTTGTTT TTAAACAAAA 2280 

GATATTATTT TGTGTAT6AA TCTAAATOUl GCCC ATCTG T CATTATCJTTA CTGTCTTITT 2340 

TAATCATGTG GTTTTGTATA TTAATAATIG TTOACTTTCT TAQATTGACT TCCATATGTG 2400 

AATGTAAGCr CTTAACTATQ TCTCTTTGTA ATGTGTAATT TCTTTCTGAA ATAAAACCAT 2460 
TTGTGAATAT 

Seq ID NO: 653 Protein sequence
Protein Accession #: NP_055606.1

1 11 21 31 41 51

I I I I I I

MKDYDELLKY YELBBTZGTG GFAXVRLACB ILTGEMVAXR ZMDKHTLGSD LPRIKIEZEA 60 

LXNUmQHIC OLYRVLETAN KIPMVLEYCP GGELFDYIZS QDRLSEEETR WFRQIVSAV 120 

AYVHSQGYAH BDLRPENLLF DEYHKLKLID FGLCAKPK07 KDYEIiQTCCG SLAYAAPELI 180 

QGKSYLGSEA DVWSr«3ILLY VLMOGFLPFD DDKVMALYKK IMRQKYDVPK WLSPSSIIiLL 240 

QQMZiQVDPKK RZSMIQIIiXjMB PHZMQDYNYP VENQSKNPFZ HUSDDCVTEb SVBHSNMRQT 300 

HEDLISLHQY DBLTATYLLL LAKKARGKPV RIALSSFSCG QASATPFTDI RSKHWSLSDV 360 

TASDKtfyVAG LXDYOHCEDD LSTGAATPST 5QFTKYHTBS NGVSSKSLTP ALCKTPANKL 420 

KUKBSVYTPK SAVKNEEYPM FPEPKTPVNK NQHKREILTT PNRYTTPSKA IINOCLKETPI 480 

KIPVKSTGTD KUfTGVISPE RRCRSVELDL KQAHMBBTPK RKGAKVPGSL ERGLDKVITV 540 

LTRSKSXGSA 2UX3PRSLKLH YKVTTTRLVN PDQLIiNEZMS ILPKXRVDFV QKGYTLKOQT 600 
QSDFGKVTMQ FELBVOQLQK FIIWGZRRQR LKtSSAHVYKR LVEDILSSCK V 

Seq ID NO: 654 DNA sequence
Nucleic Acid Accession #: NM_000582
Coding sequence: 88..990

1 11 21 31 41 51

I I I I I I

GCAGAGCACA GCAT0GTCC50 6ACCAGACTC GTCTCAGGCC AGTTGCAGCC TTCPCAGCCA 60 

AAOGCOQACC AAGGAAAACT CACTACCATG AfiAATTGCAG TGATTTGCTT TTGCCTCCTA 120 
GGCATCACCT GTGCCATACC AGTTAAACAG GCTGATTCTG GAA6TTCTGA GGAAAAGCA6 180

CTTTACAACA AATACCCAGA TGCTGTGGCC ACATG6CTAA ACCCT6ACC3C ATCTCAGAAG 240 

CAGAATCTCC TAGCCCCACA GWXCTTCCA AGTAAGTCCA AC6AAAG0CA TGACCACATO 300 

GATGATATGG ATGATGAAGA TGATGATGAC CATGTGGACA GCCAGGACTC CATTGACTCG 360 

AAOGACTCTG ATGATGTAGA TGACACTGAT GATTCTCACC ASTCTGATGA GTCTCAOCAT 420 

TCTGATGAAT CTGATGAACT GGTCACIGAT TTTCCCA06G ACCT6CCAGC AACO GAAGT T 480 

TTCACTCCA6 7TGTCCCCAC AGTAGACACA TATGATGGCC GA6GTGATA0 TGTGGrTTAT 540 

GGACTGAGGT CAAAATCTAA GAAGTTTCGC AGACCTGACA TCCAGTACCC TGATGCTACA 600 

6A0GAGGACA TCACCTCACA CATGGAAAGC GAGGAGTTGA ATGGT6CATA CAAGGCCATC 660 

CtXCTTGCCC AGGACCTGAA 0606CCTTC7 GATTGGGACA GCOSTGGGAA GGACAGTTAT 720 

GAAAOGAGTC AGCT6GATGA CCAGAGT6CT GAAAOCCACA GCCftCAA6CA GTCCAGATTA 780 

TATAAGOOGA AA60CAATGA TGAGA6CAAT GAGCATTCCG ATGTGATT6A TAGTCAGGAA 840 

CTTTCCAAAO TCAGOOGTGA ATTCCACAGC CATGAATTTC ACAGCCATGA AGATATGCTG 900 

GTTGTAGACC CCAAAAGTAA GGAAGAAGAT AAACACCTGA AATTTOGTAT TTCTCATX3AA 960 

TTA6ATAGT6 CATCTTCTGA GGTCAATTAA AAGGABAAAA AATACAATTT CTCACTTT6C 1020 

ATTTAGTCAA AAGAAAAAAT GCTTTATAGC AAAATGAAAB AGAACA1GAA ATGCTTCTTT 1080 

CTCAGTTTAT TGGTTGAATG TGTATCTATT TGAGTCTGGA AATAACTAAT GTGTTTGATA 1140 

ATTAGTTTAO 'I T fGTGGCrf CATGGAAACT CCCTGTAAAC TAAAAGCTTC AGGGTTATGT 1200 

CTATQTTCAT TCTATAGAAQ AAATGCAAAC TATCACTGTA TTTTAATA7T TGTTATTCTC 1260 

TCATGAATAG AAATTTATGT AGAAGCAAAC AAAATACTTT TACCCACTTA AAAAGAGAAT 1320 

ATAACATTTT ATGTCACTAT AATCTTTTGT TTTTTAAGTT AGTGTAXATT ITGTTGTGAT 1380 

TATCTTTTTO TGGTGTGAAT AAATCTTTTA TCTTGAATQT AATAAGAATT TGGTGgPS TC 1440 

AATTGCTTAT Tlx yrr iTCCU AOQGTTGTCC AOCAATTAAT AAAACATAAC CTTTTTTACT 1500 
GOCTAAAAAA AAAAAAAAAA AAAA 

Seq ID NO: 655 Protein sequence
Protein Accession #: NP_000573

1 11 21 31 41 51 

I I I I I I 

HRIAVICFCL LGITCAIPVK QADSGSSEEK QLYKKYPDAV ATHLNFDPSQ KQNLLAPQTL 60 

PSKSNESEDH MDOKDDEDDD DHVDSQDSID SNDSDDVDDT DDSHQSDBSB HSDE5DELVT 120 

DFFTDIiPATB VFTFWPTVD TYDGSGDSW YGLRSKSKKF KRSOIQYFDA TDEDZT5HME 180 

SBBLNGAYKA IPVAQDUIAP SDNDSRGXDS YETSQLDDQS AETHSBEQSR LYKRKAMDES 240 
NEHSDVZDSQ ELSKVSREFH SHSFH5BEDN LWDPKSKBB DXBLXFRISB ELDSASSSVN 

Seq ID NO: 656 DNA sequence 

Nucleic Acid Accession #: NM_003108.1

Coding sequence: 76..1401

1 11 21 31 41 51

I I I I I I

GGGGTGG6AG GGGGAGGGGG ACCTC06CAC OAGACCCAGC GOCCCGGGTT G6AG0STCCA 60 

GCCCTGCAAC GGATCATGGT GCAGCAGGCQ GAGAGCTTGG AAGCXK3AGAG CAACCT6CCC 120 

OGGGAGGCGC TGGACAOGQA GGAG0G06AA TTCATG6CTT GCAGCCXX»3T GOCCCTGGAC 180

GAGAGCGACC CAGACTGGTG CAAGACGGOG TOGGGCCACA TCAAGOGGCC 6ATGAACGCG 240 

TTCATGOTAT GGTCCAA6AT OGAACGCAGG AAGATCAIQG AGCA07CTCC 6GACAT6CAC 300 

AACGCC6AGA TCTCCAAQAG GCTGGOCAAG aSCTGQAAAA TQCTGAAGOA CAG06AGAA6 360 

ATCCCGTTCA TCOGGGAGGC GQAGOOOCTG CG6CTCAAGC ACATGGCCGA CTACCCOGAC 420 

TACAAGTACC OGCCCCGQAA AAAGCCCAAA ATGGACCCCT CGGCCAA6CC CAGCGCCAGC 480 

CAGA6CCCA0 AGAAGAGOGC GGCOGGOGGC GG0GGCX3GGA GCGCG660GG AOOCGCGGGC 540 

QOTGCCAAOA CCTCCAAGGQ CTCCAGCAAO AAATG08GCA AGCTCAAGQC OOCOGCX36CC 600 

6CG6G06CCA AGGCGGG06C GGGCAAGGCG GCCCAGTC08 0G6ACTA06G aGGCX;OGQGC 660 

GACGACTAOG TGCTGGGCAG CCTG0G08TG AGOGGCTCGO QCGGCGGCGG 06CGGGCAA6 720 

AOSGTCAAGT G0GTX5TTTCT GGATGAGSAC GACGACGA06 ACGAOGACGA C3SACGAGCT0 780 

CAGCTGCAGA TCAAACAOOA GCCGGAOGAG GAGGA0GA66 AACCACOSCA CCAGCAGCTC 840 

CTGCA6000C CGGQGCMSCA GCCGTGOGAG CTGCTGAGAC 6CTACAA08T OOCCAAAOTO 900 

CCOSCCAGCC CTAOGCTGAG CAGCTGGGOO GAGTCCCCC3 G AGGGAOOSAG CCTCTAOGAC 960 

GAGGTGOSGG COGGCGOGAC CTCGGGC6CC GGGGGCGGCA GCOQCCTCTA CTACAGCTTC 1020 

AAGAACATCA CCAAOCAGCA CC0GC06CCG CTCGCGCAGC CCOCGCTGTC GCCCGCGTCC 1080 

TGGO6CTO0Q TOTCCAOCTC CTOSTCCAGC AGCAGCX3GCA GCAGCAGCG6 CAGCAGOGGC 1140 
GAGGAOOOOG AOGACCTGAT GTTaSACCTG AGCTTQAATT TCTCTCAAAO OGCGCACAGC 1200

GCCAGCX3AGC AGCAGCTGGG GGG0GG0GO6 GGQ6CG8GQA ACCT O T C OCT GT06CTG6TG 1260 

GATAAGGATT TGGATTOGTT CAGOGAGGGC AGOCTOGGCT CCCACTTCXSA GTTCCCOGAC 1320 

TACTGCAOGC CGGAGCTGAG CGAGATGATC GCGGGGGACT GGCTGGAGGC GAACTTCTCC 1380 

GACCTGGTGT TCACATATTG AAAGOOGCXX: GCT G CTOGCT CTTTCTCTOG GAGGGTGCAO 1440 

AGCTGGGTTC CTTGGGAGGA AGTTOTASIG 6T6AT6ATGA TGATQATGAT AATQATQAIG 1500 

ATGATGGTGG TGTTGATGGT GGCGGTGGTA GGGTGGAGGG GAGAGAAGAA QATGCTGATG 1560 

ATATTGATAA GATGTOQTGA CGCAAAGAAA TTGGAAAACA TGATGAAAAT TTTGGTGGAG 1620 

TTAAAGTGAA ATGAGTAGTT TTTAAACATT TTTOCTOTOC ' mTl TrGTC CCCCCTCCCT 1680 

TCCTTTATOS TGTCTCAAGG TAGTTGCATA CCTA6TCTQG AGTTGTGATT ATTTTCCCAA 1740 

AAAATGTGTT TT7GTAATTA CTATTTCTTT TTOCTGAAAT TOGTGATTGC AACAAAG6CA 1800 

GAGGGGGOGG CX3GGG0GGA0 GGQAGGTAG6 ACCOQCTOOO GAAQGGOCTG TTTGAAGCTT 1860 

GTOGGTCm 6AAGTCTGGA AGA06TC1GC AQAGGACCCT TTTG6CA0CA CAACTGRAC 1920 

TCTAGGGAGT TG6T6GAGAT ATTTTTTTTT CTTAAGAQAA CTTAAAOAAC TGGTGATTTT 1980
TTTTTAACAA AAAAAGQG 

Seq ID NO: 657 Protein sequence 
Protein Accession #: NP_003099.1 

1 11 21 31 41 51 

I I I I I I

MVQQAESLEA ESNLPREALD TEBGEFKACS PVAXtDESDPD WCKTASGRIK ftPMNAFMVWS 60 

RIBRRKIMBQ SH3MHNAEI8 KRLGKRWRKL KDSEKIPPIR EAERLHLKHN ADYPDYKYRP 120 

RKKFKMDPSA KPSASQ5PBK SAAG6GG6SA GOGAGGAKTS RGSSKKCXSCL KAPAAAC3UCA 180 

GAGKAAQSGD YGGAGDDYVL GSLSVSGSGG GGAGKTVRCV FLDEDDDDm) DDDELQLQIR 240
QEPDBEDBBP VBQQIiI«FPO QOPSQUiRRy HVAKVPASST L8 S SA8 OT E0 A8LYDBVRA0
ATSGkOGGSR IiVYSFRNITX QBmtMifK LSPASSRSVS TSSSSSSOSS 56SS6BDM)D 
IMFDLSLHFS QSKBSASEQO UJOOAAMaiL SLSLVDXDU> SFSBaSUSB FEFPOVCTPB 
IfEMIMOHL EMIF8DLVFT t 

Seq ID NO: 658 DNA sequence
Nucleic Acid Accession #: NM_001719

Coding sequence: 123..1418






300 
360 
420 



1 
I 

GGQCGCAGCG 
CTQCCACCTG 
CGATGCAC3GT 
CCCTGTTCCT 
GCTTCATCCA 
CCATTTTGGG 
OCATGTTOir 
GCCAGGGCTT 
GCXT6CAAGA 
TGGAAOITGA 
TTTCCAAGAT 
ACHTCGGGGA 
A6CACTT0G6 
A0GAGGGCT6 
GGCACAACCT 
AGTTGGOGGG 
TCTTCAAGGC 
GCCAGAACOG 
AGAACA6CAG 
GAGACCTGG6 
A6GGGGAGT0 
AGAOSCIGOT 
A6C7CAATGC 
ACAGAAACAT 
TTGGGGCCAA 
CT6CCTTTTG 
AAACA7GAGC 
TCCTACAA6C 
GCC6GGCCAG 
TTATGAGCGC 
GG6CACATT6 
CAATAAAACS 



11 
I 

GG6CC06TCT 
GGGCXK3TGCG 
GCGCTCACTG 
GCTGCGCTCC 
C06GC6CCTC 
CTTGCCCCAC 
6C1X3GACCT6 
CTOCTACCCC 
TAGCCATTTC 
CAAGQAATTC 
CCCAGAAGGQ 
AOGCTTOGAC 
CAGGGAATCG 
CCTGGTGTTT 
GGGCCTGCAG 
CXnXSATTGGG 
CA0GQAG6TC 
CrCCAAGAOG 
CAGCGACCAG 
CTGGCAGGAC 
TGCCTTCCCT 
OCACTTCATC 
CATCTCCGTC 
GGTGGTCOGG 
GTTTTTCTGG 
TGAGACCTTC 
AGCATATGGC 
TGTGCAGGCA 
GTCATTGGCT 
CTACCAGCCA 
GTGTCTGTGC 
AATOAATG 



21 
I 

GCA6CAAGTG 
6GCC0GGAGC 
CGAGCTGOGG 
GCCCTGGC03 
CGCAGCCAGG 
0GCCCGCC5CC 
TACAACGOCA 
TACAAGGC06 
CTCACOGACG 
TTCCACCCAC 
QAAGCTGTCA 
AATGAGACGT 
GATCTCTTCC 
GACATCACAG 
CTCTCGGTGG 
0G6CACGG6C 
CACTTCOGCA 
CCXIAAGAACC 
AGGCAG6CCT 
TGQATCATOO 
CTGAACTOCT 
AAOCOGGAAA 
CTCTACTTCX3 
GCCTGTGGCT 
ATCCTCCATT 
CCXTOCCTAT 
TTTT6ATCA6 
AAACCTAGCA 
GGGAAGTCTC 
GGCCACCCAG 
GAAAGGAAAA 



31 
I 

AC0GA06GCC 
COGGAGCCOG 
OGGOGCACAG 
ACTTCAGCCT 
AGOOGCGGGA 
CX^CACCTCCA 
TOGOGGTGGA 
TCTTCAGTAC 
CCGACATGGT 
GCTACCACCA 
OGGCAGCOGA 
TCOGGATCAG 
TGCTCGACAG 
CCACCAGCAA 
AGACGCTGGA 
CCCAGAACAA 
GCATC08GTC 
AGGAAGCCCT 
GTAAGAAGCA 
CGCCTGAAGO 
ACATGAA06C 
0SG7G00CAA 
ATGACA6CTC 
GCXIACTAGCT 
GCTCGCCTTQ 
CCGCAACTTT 
TTTTTCAGTG 
GGAAAAAAAA 
AGCCATGCAC 
CCGT6GGAGG 
TTGACCOGGA 



Seq ID NO: 659 Protein sequence
Protein Accession #: NP_001710



MHVRSLRAAA 
ILGLPHRPRP 
LQDSHFLTDA 
ZRERFDNSTP 
HNLGLQLSVS 
QNRSKTPXKQ 
GECAFPLNSY 
RISMWRA06C 



11 
I 

PHSFVALWAP 
HLQGKmiSAP 
DMVMSFVNLV 
RZSVYQVLQB 
TLOOQSZNPK 
EAIiBMAUVAB 
MNATNHAIVO 
K 



21 

I 

LFLLRSALAD 
MFMLDLYNAM 
EHDKEFFHPR 
HLGRBSDLFIi 
IiAGLIGSBOP 
NSSSDQRQAC 
TLVHFZNFET 



31 
I 

FSLDKFVHSS 
AVEEGGGPGG 
YHHREPRFDL 
LOSRTLMEASB 
QNKQP PMVAP 
KKHELYVSPR 
VPKPCCAPTQ 



Seq ID NO: 660 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 211..1895



GGATCTQAGG 
GGGGCTTGGG 
GAGGAATTAT 
TGATTTTTTT 
GTGCTTTTTC 
CACAGGTTCC 
CTTGTOCTQA 
GAAOGTAATT 
AAAATAT066 
TTCOGACACT 
GCCAATTATT 
TTCTTTGAAC 
GCTGTGGCTA 
CACATGCACT 
GTAGTCCAT6 
CAAAATTOCA 
GTTGTGAT6T 
TAOCTGCATA 
ATCTTGATAG 
ACTCTQGCTG 
GCAOGGATCT 
CTAOCTACCA 
AAACTGQOCA 



11 
I 

GGOGCCCAGT 
AGGCAGCCTG 
CTGATAAAAT 
CCCTOG AA AA 
TTTTCTCTTC 
TTGAACAGCT 
AAGCGAAAGT 
6TTTCCCTGA 
CTGTTCCATG 
GTAACCCCAA 
CAGACTGCCT 
GCCTCTATGT 
TTCTCATCAT 
TATTT6TGTC 
CTCACATAG6 
TT6AGGCAAC 
TTATTTACTT 
ATCTCATCTT 
GCTGGGGGTT 
ATGOGAGGTG 
TAGCAGCTAT 
AAATCTOGGA 
AATOGACACT 



21 

I 

CACTTCCTCC 
CTCTCCAGTC 
TCCTGGGTTA 
TGACCTTTTT 
TTTTTCTAOQ 
GGATTCT6AT 
ACAATGTGAA 
ATGGGA7GGA 
CCXTTCCTTAT 
TGGAACATGG 
TCGCTTTCTG 
AATGTATACC 
TOGTTACTTC 
TTTCATGCTG 
AGTAAA6GAG 
TTCTGTGGAC 
CCTGGCXACA 
TGTGGCTTTC 
TCCAGCAGCA 
CTGGGAACTT 
TGG6CTGAAT 
GACCAATGCA 
G O TOCTSGTC 



31 

I 

AOGTTCTCGT 
CCTATCCACC 
ATA7TTTTAA 
ATQCTTOGAA 
ATAAATGAAA 
GGCACCATTA 
CTCAACATCA 
CTCATTTGTT 
ATTTAT6ACT 
GATTTTATQC 
CAGCCAQATA 
GTTGGCTACT 
AGACGATT6C 
AGAGCTACAA 
CTG6AGT0CC 
AAATCACAAT 
AATTATTATT 
TTTTCGGACA 
TTTGTTGCA6 
AGTQCTGQAO 
TTTATTCTCT 
GTTOGGCATG 
CTAOTCTTTG 



41 
I 

GGGA0GGC06 
GGTAGOGCGT 
CTTCGTGGCG 
OGACAAOGAG 
GAT6CAGCGC 
GGGCAAGCAC 
GGAG66C66C 
CCAGGGCCCC 
CATGAGCTTC 
TOGAGAGTTC 
ATTCOGGATC 
CGTTTATCA6 
CCGTACCCTC 
CCACTGGGTG 
TGGGCAGAGC 
GCAGCCCTTC 
CAOGOGGAGC 
60QGATG6CC 
OGAGCTGTAT 
CTACGC06CC 
CACCAACCAC 
GOCCTGCZGT 
CAA06TCATC 
CCTCOGAGAA 
GCCAGGAACC 
AAA66TGIGA 
GCAGCATCCA 
ACAAOGCATA 
GGACT06TTT 
AAGGQGGCGT 
AGTTCCTOTA 



41 

I 

FIHRSLRSQB 
QGPSYPyKAV 
SKIFEGEAVT 
EGHLVFSITA 
FKATEVBFR8 
DLGHQDHIIA 
UIAISVLYFD 



41 

I 

6CTGGG0GGG 
CACAGGTTTT 
AAAOGGAGAO 
QCAGTTTGTC 
GCATTTCTTC 
CTATAGAGGA 
CAGCTCAACT 
GGCCCAGA60 
TCAACCAXAA 
ACAGCTTAAA 
TCAGCATAQG 
CCATCTCTTT 
ATTGCACTAG 
GaWCTTTGT 
TAATAATGCA 
ATATOGGGTG 
GGATCXTPGGT 
CCAAATACCT 
CATGGGCTGT 
ACATCAAGT6 
TTCTGAATAC 
ACACAAGGAA 
(3U3TGCATTA 



51 
I 

CCTGOOCCCT 
AGAGC06G06 
CTCTGGGCAC 
GTCCACTOGA 
GAGATCCTCT 
AACT06GCAC 
6GGCO0G0C6 
CCTCTGGCCA 
GTCSVACCTCXS 
CGGTTTGATC 
TACAAGGACr 
GTGCTCCAGG 
TGGGCCTGGG 
GTCAATCCGC 
ATCAACCCCA 
ATGGTGGCTT 
AAACAGOGCA 
AA06TGGCAG 
GTCAGCTTCC 
TACTACTGTG 
6CCAT0GT6C 
GCG0CCAC6C 
CTGAAGAAAT 
TTCAGACCCT 
AGCAGACCAA 
GAGTATTAGG 
AT6AACAA6A 
AAGAAAAAT6 
OCAGAGGTAA^ 
GGCAAGGGGT 
ATAAATGTCA 



51 

I 

RREKQHEILS 
FSTQGPPLAS 
AAEFRIYKDY 
TSNRtrWNPR 
IRSTGSKQRS 
PBSYAAYYCB 
OSSNVILKKY 



51 

I 

AGGAGCGGAT 
TTGGGTOSGA 
TTTTTAAAAA 
AACCAGCATA 
AAGAAAAAG6 
GCAGATTGTC 
CCAGGAGGGA 
AACAGTG6GG 
AOGAGTTGCT 
TAAAACATG6 
AAAGCAAGAA 
TGGTTCCTTG 
GAACTATATC 
CAAAQACAGA 
06ATGACCCA 
CAAGATTGCT 
GGAAGGTCTC 
GTG6GGCTTC 
G6CAC6AGCA 
GATTTATCAA 
GGTTAGAGTT 
GCAATACAGG 
CATC3DT0TTC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 



60 
120 
180 
240 
300 
360 
420 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380
GTAT6CCTGC CTCACTCCTT CACTQGGCTC GGGTGQGAGft TOOGCATGOl CIGTGAGCTC 1440

TTCTTCAACT CCTTTCA0C5G I ' XlVmW t ? TCTATCWCT ACTGCTACTO CRATGSAGAG 1500 

GTTCAGGCA6 AGGTGAAGAA GATGTGGAGT OGGTGGAATC TCfCOGTGGA CTGGAAAAGG 1560 

ACACCGOCAT GTGGCAGCX3G CAGATGOGGC TCAGTGCTCA CXACOGTGAC GCACAGCACC 1620 

AGCAGCX3VGT CACAGGTGGC G6CCAGCACA OSCATGGTGC rXATCTCTGQ CAAAGCTGCC 1680

AAGATOQGCA 6C3U3ACA0CC TGACAGCCAC ATCACTTTAC CTGOCTAtGr CXG6A6TAAC 1740

TCAGAGGAGO ACTGOCIQCC ACACTCTTTC CAOSAGGAGA CCAAGGAAGA TAGTGGGAGG 1800

CAOGGAGATO ATATTCTAAT GGAGAAGGCT TGCAGGCCTA TGGAATCTAA CCCAGACACT 1860
GAAGGATGOC AAGGAGAAAC TGAGGAT6TT CTCTGA 

Seq ID NO: 661 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

I I I I I I

NtiRSSLSTSZ VLFIJSSFSr INBS2SSRXR fiRFLEQLDSD OTITISBQIV LVUCAKVQCZ 60 

LNZTAQLQBO EGHCFPSHDG LICWPRGTVG KISAVPCPPY lYDFNHRGVA PRBQiniGTW 120 

DFHKSXiHKTW ANYSDC3:iRFL QPDISIGKQB FPERLYVMyT VGY8ISFGSL AVAILIIGYP 180 

RRLUCTRNYI HMHLFVSFML RATSZFVKDR WHAHZGVKB LBSZiINQDDP QUSIEATSVD 240 

KSQYIGCECIA WMPIYFZAT imWZLVBGL YLHNLZEVAF PSDTKnjMGP ZL1GH6FPAA 300 

PVAAMAVAHA TLADARCHEL SAGDIKWIYQ APILAAZGUI FILPUITVRV LATKZHBIMA 360 

VGHDTRKQYR KLAKSTLVLV LVFSVHYIVP VCLFHSFIGKi GHEISMBCEL FFSSFQ6FFV 420 

SZIYCyOIGE VQAEVKKKWS RKNI«5VZ3WKa TPP06SRRC36 SVItTTVTHST SSQSQVAAST 480 

RMVLISGKAA KIASRQPDSH ZTLPGYVWSN SGQDCLPRSF HEETKEDSGR QGDDZLMEKP 540 
SRPMESNPDT EGOQGETEDV L 

Seq ID NO: 662 DNA sequence
Nucleic Acid Accession #: NM_005048
Coding sequence: 143..1795

1 11 21 31 41 51 

I I I I I I 

GGCX3GGTGGC CCGGGCCCGA CCACCCCAGC TGCG0GTC6T TACTG6CCAC AAQTTTGCTC 60 

TGGGCXaVGCC AAGTTGGCAA CTTGGAAGCT TCTCCCGGGC TCTGGAGGAG GGTCCXTTGCT 120 

TCrrCCTACA GCGGTTCCGG OCATGGCCGG GCTGGGGGCG TCGCTCCACG TCTGGGGTTG 180 

GCTAATGCTC GGCAGCTGCC TCCTGGCCAG AGCCCAGCTG GATTCTGATG GC3^CCATTAC 240 

TATAGAGGAG CAGATTGTCC TTGTGCTGAA ,AGCGAAAGTA CAATGTGAAC TCAACATCAC 300 

AGCTCAACTC CAGGAGGGAG AAGGTAATTG TTTCCCTGAA TGGGATGGAC TCATTTGTTG 360 

GCCCAGAGGA ACAGTGGGGA AAATATCGGC TGTTCCATGC CCTCCTTATA TTTATGACTT 420 

CAACCATAAA GGAGTTGCTT TCCGACACTG TAACCCCAAT GGAACATGGG ATTTTATGCA 480 

CAGCTTAAAT AAAACA7X3GG OCAATTATTC AGACTGCCTT OSCTTTCTGC AGCCAGATAT 540 

CAGC31TAGGA AAGCAAGAAT TCTTTGAACG CCTCTATGTA ATGTATACCG TTGGCTACTC 600 

CATCTCITTT GGTTOCTTGG CTGTGGCTAT TCTCATCATT GGTTACTTCA GACGATTGCA 660 

TT6CACTAG0 AACTATATCC ACATGCACTT ATTTGTGTCT TTCATCCPGA GAGCTACAAG ' 720 

CATCTTTGTC AAAQACAGAG TAGTCCATGC TCACATAGGA GTAAAGGAGC TGGAGTCCCT 780 

AATAATQCAG GATGACCCAC AAAATTCCAT TGAGGCAACT TCTGTGGAC3V AATCACAATA 840 

TATOGGQTGC AAGATTGCTG TTGTGATGTT TATTTACTTC CTQGCTACAA ATTATTATTG 900 

GATCCTGGTG 6AAG6TCTCT A0CT6CATAA TCTCATCTTT OTQQCTrrCT TTTG06ACAC 960 

CAAATACCTG TGGGGCTTCA TCTT6ATAG0 CTGGGGQTTT GCAGCAGCAT T TG TT G CAGC 1020 

ATGOGCTGTG GCAOGAGCAA CrCTGGCTGA TG06AGGTGC TGGGAACTTA GTGCTGGAGA 1080 

CATCAAGTGG ATTTATCAAG CACOGATCTT AGCAGCTATT GGGCTGAATT TTATTCTGTT 1140 

TCTGAATACG GTTAGAGTTC TAGCTACCAA AATCTGG6AG ACCAATGCAG TTGGGCATGA 1200 

CACAAGQAAO CAATACAOGA AACTG6CCAA AT06ACACTG GTOCTGGTCC lAOTCmGO 1260 

AGTGCATTAC ATOGT G TTCG TATGCCTGCC TCACTCCTTC ACTGG6CTG8 GGTGGGAGAT 1320 

CX3GCATGCAC TGTGAGCTCT TCTTCAACTC CTTTCAGGGT TTCTTTGTGT CTATCATCTA 1380 

CTGCTACTGC AATGGAGAGQ TTCAGGCAGA GGTGAAGAAG ATGTGGAGTC GGTGGAATCT 1440 

CTCOSTQGAC TGGAAAAGGA CACOGCCATG TGGCAGCOGC AGATG06GCT CAGTGCTCAC 1500 

CAC06TQA00 CACAGCACCA GCAGCCAGTC ACAGGI GG Og GCCA6CACAC GCATGGTQCT 1560 

TATCtCTGGC AAAGCTGCCA A6AT06CCAG CAGACAGCCT GACAGCCACA TCACTTTACC 1620 

TGGCTATGTC TGGAGTAACT CAGAGCAGGA Cr G CCT G CC A CACTCTTTCC ACGAGGAGAC 1680 

CAAGGAAGAT AGTQGGAGOC AGGGAQATGA TATTCTAATG GAGAAGCCTT CCAGGCCTAT 1740 

QGAATCTAAC 0CA6ACACT0 AAGGATGCCA AGGAGAAACT GAGGATGTTC TCTGAATGGA 1800 

CATTTOIOGC TGACTTTCAT QGGCTGGTCC AATGGCTGGT T6TGTGAGAG G6CCTG6CTG 1860 

ATACTCCTAT GCITGAGTTC AAA06CT6AA AATTCAGITA AOSrOTTACT TAATAATAGT 1920 

TTTTAGGCTC CATGAATT6G CTCCTGTAAA TACTAACGAC ATGAAAATGC AAGTGTCAAT 1980 

GGAGTAGTTT ATTAOCTTCT ATTGGCATCA AGTTTTCCTC TAAATTAATG TATGGTATTT 2040 

GCTCTGTGAT TGTTCATTTT TTTCTGCTAC TTTTGGGTAG AAAAAAGATT CAATTGCTTG 2100 

GCT0TA6CTT TCTCTGATAT ATATCACCCT AAATATAATO AAGATCTTTT AGTGTGTATC 2160 

ATTTTCCTTT TAGAAACTAG TATTCTCTTA TTTCTTACTT TAATGTACTT CTATCACTGC 2220 

ATTTATTfTG CCTCTGCATA GGAGCAATTA GGATCTAAAA AAATATATQO QAAOATAAAA 2280 

GATCTAAGAA CMOThCTTG CTGGAAAATT AGTTGGCTGG ACATTGATAA AATAATGCAT 2340 

TTATAACAAT TACATGTGTT TTTGGGAACA AGGAAAATTT CTCAAAAAAG AATATTTCAC 2400 

ACATOOCTTC TTTTGAATGG OCTCTTTGTG ACCAGCCAGA CCTCAOGTCT TCACTCTTTC 2460 

TTCTTTQTAA ACCATGTCAT GTGGAAAGAT TTCCTCA6TT AGTGAGCTTG TGTCTGCAAA 2520 

TTQATTrrGT TT6TAATGTA TTTT6ATAGC AAATCATGCT GCATCTAXAT emTlCTiti 2580 

TTTGAGCTGT TACrAQVTTG TACATGGCAT GT0G6ATCAA 7TAAAAATTT GTTTTAAAAA 2640 
T 

Seq ID NO: 663 Protein sequence
Protein Accession #: NP_005039

1 11 21 31 41 51 

I I I I I I

MAGLQASZfV NGHLNLGSCL LARAQLDSDG TZTZBBQIVli VLKAKVQCEIi NZTAQLQEGB 60 

GNCFPEHDGL ZCWPRGTVGR ZSAVPCPPYZ YDPNEKGVAF RHQIFNGTOD FNHSU^lCrHA 120 

NYSDCXtRFLQ PDZ8ZGKQEF FERLYVMYTV GYSISFGSLA VAILZIGYTR RLHCTRNYZB 180 

MHZ1FV8FMI1R ATSZFVKDRV VHAHZGVKEL BSIiZMQDOPQ KSZEATSVOK GQyZGCKZAV 240
VMFIYFIATN YYWILVEGLY LHBLXFVAFF SDTRYLHGFI LI6K6FPAAF VAAKAVARAT 300

LADARCWEZ18 AGDIKHIVOA PIZJUUGINP ILFUTrVRVL ATKIWB7KAV (SZyTRXOYRK 360 

LAK5TLVLVL VFGVHYIVFV CLFHSFTGLG KEIRKECBLF FKSPQGPFVS ZiyarCMGEV 420 

QAEVKKMWSR HNLSVDWKRT PP06SRR06S VLTTVTBSTS SQSQVAASTR MVLISGKAAK 480 

lASRQPDSRI TLPGYVWSirS EXZDCLPHSFB EETKEDS6RQ GDDILMEKPS RPMESHPDTB 540 
GOQGBTEDVL 

Seq ID NO: 664 DNA sequence
Nucleic Acid Accession #: NM_012152
Coding sequence: 43..1104

1 11 21 31 41 51 

I I I I I I

CTTCTTTAAA TTTCTTTCTA GGATQTTCAC TTCTTCTCCA CAATGAATGA GTGTCACTAT 60 

GACAA6CACA TG6ACTTTTT TTATAATA06 AGCAACACT6 ATACTSTOSA TX^CTGGACA 120 

OGAAOUIAGC TTGTGATTGT ' mXaWKSI T GGGAOGTTTT TCTGOCTGTT TATTTTTTTT 180 

TCTAATTCTC TGGTCATCGC GGCAGTGATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTGTTGG CTAATTTAGC TGCTGCOGAT TTCTTOSCTG GAATTGCCTA TGTATTCCTG 300 

ATGTTTAACA CAGGCCCAGT TTCAAAAACT TTGACTGTCA ACOSCTGGTT TCTCCGTCAG 360 

GGGCTTCTGG ACAGTAGCTT GACTGCTTCC CTCACCAACT TGCTGGTTAT 06C0GTGGAG 420 

AGGCACATGT CAATCATQA6 GATGC6GGTC CATAGCAACC TGACCAAAAA GAG6GTGACA 480 

CTOCTCATTT TGCTT G TCTG GGCCATCGCC ATTTTTATGG GGGCGGTCOC CACACTGG6C 540 

TGGAATTGCC TCTGCAACAT CTCTGCCTGC TCTTCCCTGG CCCCCATTTA CAGCAGGAGT 600 

TACCTTGTTT TCTGGACAGT GTCCAACCTC ATGGCCTTCC TCATCATGGT TGTGGTGTAC 660 

CTGOGGATCT ACGTGTAOST CAAGAGGAAA ACCAACX5TCT TGTCTCOGCA TACAAGTGGG 720 

TCXIATCAGCC GCCS6AG6AC ACCCATGAAG CTAAT6AAGA 0GGT6ATGAC TSTCTTAGGQ 780 

GOGTTTGTGG TATGCTGGAC CCCGGGCCTG GTGGTTCTGC TCCTOGAOGG OCTGAACTGC 840 

A0GCA6TGT6 GOGTGCAGCA TGT6AAAAGG TGOTTCCTGC TGCTQOOGCT GCTCAACTCC 900 

GTOSTGAACC CCATCATCTA CTCCTACAAG GACGAGGACA TGTATGGCAC CATGAAGAAG 960 

ATGATCTGCT GCTTCTCTCA GGAGAACCCA GAGAGGOGTC CCTCTOGCAT CCCCTCCACA 1020 

GTCCTCAGCA GQAGTGACAC AGGCAGCCAG TACATAGAGG ATAGTATTAO CCAAGGT6CA 1080 

GTCTGCAATA AAAGCACTTC CTAAACTCTG 6ATGCCTCTC G6C0CAC0CA G6TGATGACT 1140 
GTCTTAOG 

Seq ID NO: 665 Protein sequence
Protein Accession #: NP_036284

1 11 21 31 41 51 

I I I I I I 

MIIEC3IYDKHM DPFVKRSNTD TVDDHTGTKL VrVLCVGTFP CLFIFPSMSL VIAAVIKNRK 60 

FHPPFYYLLA NLAAADFFAG lAYVFLMFNT GPVSKTLTVN RWFLROGLU) SSLTASLTNL 120 

LVIAVESHMS IMRMRVHSNL TKKRVTLLIL LVWAIAXFKG AVPTLGMNCL OflSACSSLA 180

PZySRSYLVF MTVSKUfAFL INVWYIAIY VYVKHKTNVL 6PETSGSISR BRTPMKLMKT 240 

VMTVU3AFW CWTPGLWLli LDGLNCRQOG VQHVKRHFLL LALLNSWNP IIYSYKDEDM 300 
YGTMKKMICC PSQENPERRP SRIPSTVLSR SDT6SQYIED SXSQQAVOnC STS 

Seq ID NO: 666 DNA sequence
Nucleic Acid Accession #: NM_002821
Coding sequence: 150..3362

1 11 21 31 41 51 

I I I I I I

AACTOOOSCC TCOOQAOGCC T060GGT0QG GCTCOGGCTG CGOCTGCTGC TGCQGOGOCC 60 

G08CTC0GGT GOGTCCGCCT CCTGTGCCOG COGOGGAGCA GTCTGCGGCC OGCC6TGO0C 120 

CCTCAGCTCC TTTTCCTGAG CCCGCCGCGA TGGGAGCTGC GGGGGGATCC CCGGCCAGAC 180 

CCCGCOSGTT GCCTCTGCTC AGOGTCCTGC TGCTGCCGCT OCTGGGCGGT ACCCAGACAG 240 

CCATTGTCTT CATCAAGCAG CCGTCCTCCC AGGATGCACT GCAQGGGCGC CGGGOGCTGC 300 

TTGGCTGTGA G0TTQAG6CT COQOGCCOQQ TACATGTGTA CTOGCT GCTC GATGgGCCC 360 

CTGTCCAG6A CACGQAGCGO COTTTCX^CCC AGGGCA6CAG CCTGAGCTTT GCAGCIGTGO 420 

ACOQGCTGCA GOACTCTGGC ACCTTCCAGT GTGTGGCTCG GGATGATGTC ACTQOAGAAG 480 

AAGCCCGCAG TGOCAACGCC TCCTTCAACA TCAAATGGAT TQAGGCAGGT CCTGTGGTCC 540 

TGAAGCATCC AOCCTCGGAA GCTGAGATCC AGCCACAGAC CCAGGTCACA CTTOGTTGCC 600 

ACATTQAIGG 6CACCCT030 GCX3U3CTAOC AATOGTTOCG AOATCGGAOC CCOCTTTCTO 660 

ATGGTCAGAG CAACCACACA GTCAGCAGCA A06AGC66AA CCTGACGCTC C660CAGCTG 720 

GTCCTGAGCA TAGTGGGCTG TATTCCTGCT GOGCCCACAG TGCTTTTGGC CAQGCTTGCA 780 

GCAGCCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGGGTG GTGCTGQCAC 840 

C0CAG6AC6T GGTAGTA60G AG6TATGAGG AGGCCATGTT CCATTGCCAG TTCTCA6CCC 900 

AGCCACCCCC GAGOCTGCAG TOtfClVlVfa AGOATGASAC TCCCATCACT AACOGCAGTC 960 

GCCCCCCACA CCTCCGCAGA OCCACAiSTOT TTGCCAAOGG GTCTCTGCTO CTQACOCAGO 1020 

TC06GCCA0S CAATQCAGGO ATCTACCCCT GCATTGGCCA GGGGCAGAGG GGCCCACCCA 1080 

TCATCCTGGA AGCCACACTT CACCTAGCAG AGATTGAAGA CATGCOGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGGCA6C GAG6AG0GTG TGACCTGCCT TCCCCCCAAG GGTCTQCCA6 1200 

AGCCCAG06T G TGGTGGQAG CAG6CGG0AO TCGGGCZGCC CACCCAXGGC AGGGTCTACC 1260 

AGAAG6GCCA OSAGCTGGTG TTeGCCRATA TTGCTGAAAG TGATGCTOGT GTCTACACCT 1320 

GOCACGOGGC CAACCTGGCT GGTCAGOGGA GACAGGATGT CAACATCACT GTGGCCACTG 1380 

TGCCCTCCTG GCTGAAGAAG CCCCAAGACA GCCAGCTGGA GGAGGGCAAA CCCGGCTACT 1440 

TGG A TT G OCT GACCCAGGCC ACACCAAAAC CTACAGTTGT CTGGTACAGA AACCAGATGC 1500 

TCATCTCMSA GGACTCAC6G TTCGAGGTCT TCAAQAATGQ GACCTTGC6C ATCAACA60G 1560 

TGGAGGTGTA TGATGGGACA TQGTACOGTT GTATGA6CA6 CACCCCA6CC G6CA6CAT06 1620 

AGGOGCAAGC COGTGTCCAA GTQCTGGAAA A6CTCAAGTT CACACCACCA CCCCAGCCAC 1680 

AGCAGTGCAT GGAGTTTGAC AAGGAGGCCA CGGTGCCCTG TTCAGCCACA GOCOGAGAQA 1740 

AGCCCACTAT TAAGTGGGAA OGGGCAGATG GGAGCA6CCT CCCAGAGTGO GTGACAGACA 1800 

ACCCTGGGAC CCTGCATTTT QCCCGGGTGA CTCGAGATG3V OGCTGGCftAC tACACT TGCA 1860 

TTG C CTCCAA CGGGCCGCAG GGCCAGATTC 6IGCCCATGT CCA6CTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAGTGGAA CCAGAGGGTA OGACTGTGTA CCAGG6CCAC ACAGCCCTAC 1980 

TGCAGTGGGA GGCCCAQGGG GACCCCAAGC OGCTGATTCA GTGGAAAGOC AAGGACCGCA 2040 

TCCT66ACCC CACCAAGCTG GGACCCAGGA TGCACATCTT CCASAATOQC TCOCTGGTGA 2100 
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TCCATGAOrr GGCCCCTGAQ GACTCAGGCC GCTACACCTG 
ACATCAAGCA CAOCaAGGCC CCCCTCTATG TCGTGGACAA 
AGGGGOCTGO CAGCCCTCCC CCCTACAAGJV TGATGCASAC 
COO C TG reGC CTACATCATT GC05T6CTG6 GCCTCATGTT 
AAG0CAA6G6 6CTGCAGAA6 CA6CC06AGG GOGAGGAGOC 
GAGGGCCTTT GCAGAACGGG CAGCCCTCAG CAGAGATCCA 
GCTTGGGCTC OGGCCCOGOG GCCACCAACA AAOSCCACAG 
TCCCAOGGTC TAGCCTGCAO CCCATCACCA OGCTGGGGAA 
TCCTGGCAAA GGCTCAGG6C TTGGAGGAGG GAGTGGCAGA 
GCCTGCAGAC GAAGGAT6AG CAGCAQCAGC TGGACTTC06 
GGAA5CTGAA CCACGCCRAC GTX3GTG0GGC TCCTGGGGCT 
ACTACATGGT GCTGGAATAT GTGGATCTGO GAGACCTCAA 
ASAGCAAOGA TGAAAAATTO AAGTCACAGC CCCTCAGCAC 
GCACCCAGGT AGCCCT6GGC ATGGAGCACC T6TCCAACAA 
TGGCTGOGCXS TAACTGCCTG GTCAGTGCCC AGAGACAAGT 
TCA6CAAG6A TGTGTACAAC AGTGAGTACT ACCACTTCOS 
GCT6GATGTC CCCOGAGGCC ATCCTGGAGG GTGACTTCTC 
OCrrCGGTGT GCTGATGTGG GAAGT6TTTA CACATGGAGA 
CAGATGATGA AGTACTGGCA GATTTGCAGG CTGGGAA6GC 
GCTGCCCTTC CAAACTCTAT CGGCT6ATGC AGOGCTGCTG 
GGCCCTCCTT CAGTGAGATT GCCAG06CCC TG6GAGACAG 
GAGGAGGGA6 CCGGCTCAGG ATGGCCTGOG CAGGG6AGGA 
CA6CATGATG GGCAAGATCC CTGTCCTCCT GGGCCCTGAG 
TTGCTGAGGT CTGAGCAGGG CCTGGCCTTT CCTCCTCTTC 
GGCTSACrro GACCCAAACT 6GG0SACTAG GGCTTT G AGC 
CTCTTCCTCr ATCAGGGACA GT6TGGGT6C CACAG6TAAC 
TTCTCCCCTT GACCJGGGTCC AACTCTGOCA CTCATCTGCC 
A6GCTTGGGA TGAGCTGGGT TTGTGGGGAG TTCCTTAATA 
AGGGTTAATG AGTCTCTTGC CCACTCXTTCC ACTTGGGGGT 
ACAC31GCAAG TGAGTCXTTCC CCACTCTGGO CTTGTOCACA 
OCCCACCCTT CTCTCCTTTC CTCATCCTAA OTOCCT GQ CA 
CTTTTGAGAC TATATAAACC G0CCTTT7TG TAT6CACCAC 
TGCAGCXTTGG GGTGGGTGGG CAT6G6AGGT AGGG6TGGGC 
GCCATCCTTA CCCCACACTT TTATTGTTGT CGTTTTTTGT 

itjrrm ' criT tttacactgg ctgctctcaa taaataagoc 

Seq ID NO: 667 Protein sequence
Protein Accession #: NP_002812



MGAARGSPAR 
VRVYWLLDGA 
IKWIEAGPW 
KERmiTLRPA 
EAKFROQFSA 
CZGQGQRGPP 
VRLPTHGRVY 
SQLEEGKPGY 
04SSTPAGSX 
GfiSLPEHVTD 
TTVYQGHTAL 
KYTCIAGNSC 
GLMPYCKKRC 
KRHSTSDKMR 
LDFRRELEMF 
PIiSTRQKVAL 
YHFROAWVPL 
AGKARIiPQPB 



II 

I 

PRRLPLLSVL 
PVQDTERRPA 
LKHPASEAEI 
GPEBSGLYSC 
QPPPSLQHLF 
IZLEATLBLA 
QKGHELVLAN 
LDCLTQATPK 
EAQARVQVLE 
NAGTIHPARV 
LQCEAQ6DPK 
NIKHTEAPLY 
KAKRLQKQPE 
FPRSSLQPXT 
GKLNHANWR 
CTQVALGHBH 
RWtSPBAZLE 
GCPSKLYRLN 



21 
I 

LLPLXXSGTQT 
QGSSLSFAAV 
QPOTQVTLRC 
CAHSAFGQAC 
EDETPZTNKS 



IAESDA6VYT 
PTWWYRNQM 
KLKFTPPPQP 
TBSOAGNYTC 
PliIQWRSKDR 
WDKPVPEBS 
GEEPEMEd^ 
TLGKSEFGEV 
LLGLCREAEP 
LSNNRFVHXD 
GEDFSTKSDVIf 
QRCNALSPKD 



31 
I 

AlVFIKQPSS 
DRLQDSGTFQ 
HIDGHPRPTY 
SSQNFTZ.SIA 
RPPBLRRATV 
RVFTASSEBR 
CHAANLAGQR 
LISEDSRFEV 
QQQIBFDKEA 
ZASNGPQGOZ 
IL0PTKL6PR 
BGPGSPPPYK 
GGPLQNGQPS 
FLAKAQGLEE 
HYMVLBYVDli 
LAARNCLVSA 
AFGVLMWBVP 
RP8PSBIASA 



CATTGCAGGC 
GCCTGTGCCG 
CATT6GGTTG 
CTACT6CAA0 
AGAGATGGAA 
AGAAGAAGTG 
CACAAG TGAT 
QAGTGAGTTT 
GACCCTGGTA 
GAGGGAGTTG 
6TGCCGGGAG 
OCAGTTCCTG 
CAAGCAGAA6 
CCGCTTTGTG 
GAAGGTGTCT 
CCAGGCCTGO 
TACCAAGTCT 
GATGCCCXAT 
TAGACTTCCT 
GGCCCTCAGC 
CAOOGTGGAC 
OITCTCTAGA 
GTGCCCTAGT 
CTCACCCTCA 
TGGGCAGTTT 
OCCAATTTCT 
AACTTTGCCT 
TTCTCAAGTT 
CTAGACCAGG 
CTGACCCAGA 
GATGAAGGAG 
GGG0G6CTTT 
CCTGGAGAT6 
TTGTTrrGTT 
TTTTTTA 



41 
I 

QDALQGRRAL 
CVARDDVTGE 
QNFHDGTPLS 
OESFARWLA 
FANGSLLtiTQ 
VTCLPPRGIiP 
RQDVNITVAT 
FKNGTLRINS 
TVPCSATGRS 
RAHVQLTVAV 
MBZFQNG5LV 
MIOTIGLSVG 
AEIQEEVALT 
GVAETLVLVK 
GDLRQFItRZS 
QRQVKVSAL6 



AACAGCTGCA 
GAGGAGTCGG 
TGGGTGGGTQ 
AAGOGCTGCA 
T60CTCAA0G 
GCCTTGACCA 
AAGATGCACT 
GGGGAGGTGT 
CTTGIGAAGA 
GAGATGTTTG 
GCTGAGCXXX: 
AGGATTTCCA 
GTGGCCCTAT 
CATAAGGACT 
GCCCTGGGCC 
GTGCCGC7GC 
GAXGTCTGGG 
GGTGGGCAGG 
CAGCCCGAGG 
CCCAAGGACC 
AC^AAGCCGT 
GGGAAGCTCA 
GCAACAGGCA 
TCCTTTGGGA 
CCOCTGCCAC 
QGCCTTCAAC 
GGGGAGGGCT 
CTGGGCACAC 
ATTATAGAGG 
CCCAOGTCTT 
TTTTCM36AG 
TATATOTAAT 



51 

I 

LRCEVEAPGP 
EARSANASFK 
DGQSNHTVSS 
PQSVWARYE 
VSPRNAGZYR 
BPSVHNEBAG 
VPSWLKKFQD 
VEVYDGTWYR 
KPTIKWERAD 
PITFKVEPER 
ZEDVAPEDS6 
AAVAYIIAVL 
SLGSGPAATN 
SLQTKDBQQQ 
RSKDEKIiKSQ 
LSKDVYNSEY 
ADDEVLADLQ 



LGDSTVDSKP 



Seq ID NO: 668 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1389



ATGGGCTACC 
ACCCTTOTTT 
GTTGTCAACT 
GGGTTTCCTT 
GTTTTATTGA 
AAAACTTTC6 
ATAGCAATGA 
ATCCCAGGAG 
AGAGTTACCT 
TCCCTCATCT 
TCACTGG6TC 
ATTCAA6CG6 
TACAGTTCTC 
GTQATTTCTG 
TTCAOOCAAG 
AGATTTTGTT 
GAGGTAATTG 
ACAGTGATGG 
GTTCTAtSWlC 
TGTTATCTGA 
A3GCTT00CA 




51 
1 

TQACAGAGAA 
TCTTTTTAAT 
GAAGCAAGCT 
CTTTTCCCTT 
TTTG6TCAAT 
GTATCCTTTT 
TTTTCAAAGA 
TGGACTTTCC 
TG6AAAGGTC 
AA6QGCAATT 
GOCCAATGCC 
CTTCTTAGTT 
TATGTCCATC 
ATTTACTGGC 
AACATTTGGA 
TG TGACA AGA 
CATTGTTGTA 
CCTOGGGATA 
TCX31TCA6CC 

cn'cnmtj'ix: 

TACAAATACT 



2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820
2880
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 






60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 




CAAGACTtSCA CCCATGG6CA GSUMCTTC TACIGCTTTC CISACMTTT CTCTCICACIi 
AATACCTCAG ASTCTCMBr TCAQCkSAn ACAOkACTTT CTACnTAAA TATTMHATC 
TTTCAATGA 



1320 
1380 






Seq ID NO: 669 Protein sequence
Protein Accession #: Eos sequence



MSYQRQEPVZ 
GFFLGILLLF 
lANISYNXIA 
SLISTGLTTL 
ySSLEEPTVA 
RFCYGVTVIL 
WBUUGVJiOi 
QDCTB6QEMF 



11 

I 

PPQRDLDDSB 
WVSWTOPSL 
GDTLSKVFQR 
II/3IVMARAI 
KHSRLIHMSI 
TYPMECFVTR 
TPLIFXZPSA 
YCFPDMFSIjT 



21 

I 

TLVSEHBYKE 
VLLXKG6AJUS 
IPGVDPENVP 
SZiGPHIPKTB 
VISVFICIPP 
£VIANVFFGG 
CHiKXfSBBPR 
NTSESaVQQT 



31 
I 

KTC3QSAALPN 
GTDTyQSLVN 
IGRHFIIGIiS 
DAHVFAKPNA 
ATOGYLTFTG 
NMSVFHIW 
7BSDKIMSCV 
TQLSTLHZSX 



Seq ID NO: 670 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1284



ATGGGCTACC 
AAGCAAGCTG 
TTTTGCCTTG 
TTGGTCAATA 
TATCCTTTTA 
TTTCAAAGAA 
GGfkCTTTCCA 
GGAAA6GTCT 
AGGGCAATTT 
CCCAATGCCA 
TTCTTAQTTT 
ATCTCCATCO 
TTTACTGGCT 
ACATTTGGAA 
G7GACAA6AG 
ATTGTTGTAA 
CT0QG6ATA0 
CCATCA6CCT 
TCTTGTGTCA 
ACAAATACTC 
TCTCTCACAA 
ATTA6TATCT 



11 

I 

AGAGGCAGGA 
GGTTTCCTTT 
TTTTATTGAT 
AAACTTTOGG 
TAGCAATGAT 
TCCCAG6AGT 
CAGTTACCTT 
CCCTGATCTC 
CACT6QGTCC 
rrCAAGCGGT 
ACAGTrCTCT 
TGATTTCTGT 
TCACCCAAG6 
GATTTTQTTA 
AGGTAATTGC 
CAGTGAT6GT 
TTGTAGAACT 
GTTATCTGAA 
TGCTTOXAT 
AA6ACTGCAC 
ATACCTCAGA 
TTCAACTCGA 



21 

I 

GCCTGTOVTC 
aOGAATAtTa 
AAAAGGA6G6 
CTTTCCA6GG 
AAC3TTACAAT 
TGATCCTGAA 
TACTCTOOCT 
TACAGGTTTA 
ACACATACCA 
OGGGGTTATG 
AGAAGAACOC 
ATTTATCItST 
GQACTTATTT 
TGQTGTCACT 
CAATGTGTTT 
CATCACTGTA 
CAA76GT070 
ACTGTCTGAA 
TOGTGCTGTO 
CXZATGGGCAS 
GTCTCATGTT 
GTAA 



31 

1 

C06C06CAGA 
CTTTTATTCT 
G O CCTCTCTG 
TATCTGCTOC 
ATAATAGCTG 
AA06TGTTTA 
TTATCCTTGT 
ACAACTCTGA 
AAAACA6AAG 
TCTTTTGCAT 
ACAGTA6CTA 
ATATTCTTTO 
GAAAATTACT 
GTCATTTTGA 
TTTGGTG6GA 
GOCAOGCTTG 
CTCTGTGCAA 
QAACCAAGffll 
GTGATGGTTT 
GAAATGTTCT 
CAGCAGACAA 



Seq ID NO: 671 Protein sequence
Protein Accession #: Eos sequence



1 

I 

MQYQRQEFVI 
I.VNKTPGPPO 
GLSTVTFTLP 
PfflUQAVGVM 
FTGFTQGDLF 
IWTVMVITV 
SCVJ4LPIGAV 
ISIFQLB 



11 

1 

PPQRGLPYSM 
YLLLSVLQFL 
LSLYRNIAKL 
SPAFZCHBNS 
ENyCHHDDLV 
ATLVSLLIOC 
VMVFGPVMAl 



21 

1 

KQAGFPIiGIIi 
yPFIAMlSYN 
GKVSX«I8TGL 
FLVYSSXJSEP 
TFGRFCYGVT 
LQIVLEliiQV 
333TQDCTHOQ 



31 

I 

LliPWVSYVTD 
IIAGDTLSKV 
TTLILGIVMA 
TVAKNSRLIB 
VILTYPMECP 
LCATPLIFII 
WYCPPDKF 



Seq ID NO: 672 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1203



1 
I 

ATGGGCTACC 
AAAGGAGGGG 
TTTCCAGGGT 
AGTTACAATA 
GATCCTtSAAA 
ACTCTGCCTT 
ACAGGTTTAA 
CACATACCAA 
GGOGTTATGT 
QAAGAACCCA 
TTTATCTGTA 
GACTTATTTG 
GGT6TCACTG 
AATGTGTTrr 
ATCACTGTA6 
AATGGTGTGC 
CTGTCTGAAG 
GGTGCTGTGG 
CATGGGCAGQ 
TCTCATGTTC 



11 
I 

AGAGGCAGGA 
CCCTCTCTGG 
ATCTGCTCCT 
TAATAGCTGG 
A0GT5TTTAT 
TATCCTTGTA 
CAACTCTGAT 
AAACAGAAGA 
CTTTTOCATT 
CAGTAGCTAA 
TATTCrTTGC 
AAAATTACTG 
TCATTTTGAC 
TTGGTGGGAA 
OCACGCTTGT 
TCTGTGCAAC 
AACCAA6GAC 
TGATGGTTTT 
AAATGTTCTA 
AGCA6ACAAC 



21 
1 

GCCTGTCATC 
AACAGATACC 
CTCTGTTCTT 
AGATACTTTG 
TQGTCGCCAC 
C06AAATATA 
TCTTGGAATT 
OQCTTGGOTA 
TATTTGCCAC 
GTGGTCCCGC 
TACATGTGGA 
CAGAAAT6AT 
ATACCCTATQ 
TCTTTCATOG 
GTCATT6CTG 
TCCCCTCATT 
ACACTCOGAT 
TGGATTCQTC 
CTGCTTTCCT 
ACAACTTTCT 



31 
I 

COGCCGCAGT 
TACCAGTCTT 
CAGTTTTTGT 
AGCAAACTTT 
TTCATTATTG 
GCAAAGCTTG 
GTAATGGCAA 
TTTOCAAAGC 
CATAACTCCT 
CTTArCCATA 
TACTTGACAT 
GACCTGGTAA 
GAATGCTTTG 
GTTTTCCACA 
ATTGATTGOC 
TTTATCATTC 
AAOATTATGr 
ATGGCTATTA 
GACAATTTCT 
ACTTTAAATA 



41 

I 

WNSIIGSGX 
KTFGFPGYLL 
TVTPTLPLSXi 
IQAVGVMSFA 
PTQGDIiFENY 
rVWITVATL 
MLPXGAWMV 
FQ 



41 

I 

GA QGATT GCC 
GG6TTTCATA 
GAACAGATAC 
TCrCTGTTCT 
GAGATACTTT 
TTQGTCGCCA 
AOCGAAATAT 
TTCITGGAAT 
ACGCTTGGGT 
TTATTTGCCA 
AGTGGTCCC6 
CTACATQTGG 
6CAGAAATGA 
CATACCCTAT 
ATCTTTCATC 
TGTCATTGCT 
CTCCCCTCAT 
CACACTC08A 
TTGGATTOGT 
ACTGCTTTCC 
CACAACTTTC 



41 

I 

PSLVLLIKGG 
PQRIPGVDPE 
RAXSLGPHXP 
MSIVISVFIC 
V7REVIANVF 
PSACYLKLSB 
SLTNTSESKV 



41 

I 

TTTCCCTTGT 
TGOTCAATAA 
ATCCTTTTAT 
TTCAAAGAAT 
GACTTTCCAC 
GAAAGGTCTC 
OGGCAATTTC 
CCAATGCCAT 
TCTTAGTTTA 
TGTCCATCGT 
TTACTGGCTT 
CATTTGGAAG 
TGACAAGAGA 
TTGTTGTAAC 
TOGGGATAGT 
CATCAOOCTG 
CTTGTGTCAT 
CAAATACTCA 
CTCTCACAAA 
TTAGTATCTT 



51 
I 

IGU>YS^^KQA 
LSVDQPLYPP 
YRNIAKXiGKV 
FICHHNSFLV 
CRNDDLVTFG 
VSLLIDCLGI 
FQFVMAZTNT 



51 

1 

TTATTCAATG 
TGTTACAGAC 
CTACCAGTCT 
TCAGTTTTTS 
GAGCAAAGTT 
CTTCATTATT 
A6CAAAGCTT 
TGTAATGGCA 
ATTTGCAAAG 
CCATAACTCC 
CCTTATCCAT 
ATACTTGACA 
TGACCTGGTA 
GGAATGCTTT 
GGTTTTCCAC 
GATTGATTGC 
TTTTATCATT 
TAAGATTATG 
CATGGCTATT 
TGACAATTTC 
TACTTTAAAT 



51 

I 

ALSGTDTYQS 
NVFIGRHFII 
KTEDAWVFAK 
IFFATCGYLT 
PGGNZiSSVFR 
BPRTHSDKIM 
Qf^TTQLSTIiH 



51 
I 

TTTATTGATA 
AACTTTCGGC 
AGCAATGATA 
CCCAGGAGTT 
AGTTACCTTT 
CCTCATCTCT 
ACTGGGTCCA 
TCAAGCGGTC 
CAGTTCrCTA 
GATTTCTGTA 
CACCCAAGG6 
ATTTTGTTAT 
GGTAATTGOC 
AGTGATGGTC 
TCTAGAACTC 
TTATCTGAAA 
GCTTCCCATT 
AGACTGCACC 
TACCTCAGAG 
TCAACTCGAG 



60 

120 

180

240 
300 
360 
420 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 



60 
120 
180
240 
300 
360 
420 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 






Seq ID NO: 673 Protein sequence
Protein Accession #: Eos sequence



1 11 21 31 41 51 

I I I I I I

MGYQRQEPVI PPQPSLVLLI KGGALSGTDT YQSLVNKTFG FPOYLLLSVL QPLYPFIAMI 60 

SYNIIAGDTL SKVPQRIPGV DPENVPIGRH PriGLSTVTF TI.PLSLVR£fI AKLGKVSLIS 120 

T6LTTLILGI VHARAXSLGP HZPXTECAHV FAKFKAIQAV 6VMSPAFXCB EKSFLVYSSL 180

EBPTVAXNSR LZBMSZVISV FZCZPFATC6 YItTFtt3PTQ0 OltFEHYCRIR) DLVTFGRFCY 240 

GVTVILTyPH BCPVTKBVIA NVFFG(2ILSS VFHXWTVMV ZTVATLVSLL ZDCLGIVLBL 300 

NGVIiCKTPLI FIIPSACYLK I«6SEPRTSSD KIHSCVMLPZ GAWHVFGFV MAIIMTQDCT 360 
BQQBNFyCFP DHFSLINTSE S HV QQTTQLS TUTtSIFQLB 






Seq ID NO: 674 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1140



1 11 21 31 41 51 

I I I I I I

AT6GGCTACC AGAGGCAGGA GCCTGTCATC COGCC GCAGG TCAATAAAAC TTT0S6CTTT 60 

CCAGGGTATC TGCTCCTCTC TGTTCTTCAG TTTTTGTATC CTTTTATAGC AATGATAA67 120 

TACAATATAA TAGCTGGAGA TACTTTGAGC AAAGTTTTTC AAAGAATCCC AGGAGTTGAT 180 

CCTGAAAA06 TGTTTATT6G TCGCCACTTC ATTATTGGAC TTTCCACA6T TACCTTTACT 240 

CT6CCTTTAT CCTTGTAC06 AAATATAGCA AA6CTTGGAA AG6TCTCCCT CATCTCTACA 300 

GGTTTAACAA CTCTGATTCT TG6AATTGTA ATGGCAAGG6 CAATTTCACT tjGGTCCACAC 360 

ATACCAAAAA CAQAAGACOC TrGGGTATTT GCAAAGCCCA ATGCCATTCA AGOGGTOGGG 420 

GTTATGTCTT TTGCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAO TTCTCTAOAA 480 

GAACCCACAG TAGCTAAGTO GTCCCX3CCTT ATCCATATGT CXJITOGTOAT TTCTGTATTT 540

ATCTGTATAT TCTTTGCTAC ATCTGGATAC TTGACATTTA CIGQCrTCAC CCAAGGOQAC 600 

TTATTTGAAA ATTACTGCA5 AAAT6ATGAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 660 

GTCACTGTCA TTTTGACATA OCCTATGGAA TGCTTTGTGA CAAGAGAGGT AATTGCCAAT 720 

GTGTTTTTTG OTOGGAATCT TTCATCGGTT TTCCACATTG TTGTAACAGT GATCGTCATC 780 

ACTGTAGCCA CGCTTGTGTC ATTG CTGATT GATTGCCTCG GGATAGTTCT AGAACTCAAT 840 

66TGTGCTCT GTGCAACTCC CXZTCATTTTT ATCATTCCAT GAQCCTSTTA TCTGAAACTG 900 

TCTGAAGAAC CAAGGACACA CTCC6ATAAG ATTATGTCTT GTGTCATGCT TCCCATTGGT 960 

GCTGTGGTGA TGQTTTTTGG ATTCX3TCATG GCTATTACAA ATACTCAAGA CTGCACCCAT 1020 

GGGCAGGAAA TGTTCTACTG CTTTCCTGAC AATTTCTCTC TCACAAATAC CTCAGAGTCT 1080 
CATGTTCA6C AGACAACACA ACTTTCTACT TTAAATATTA GTATCTTTCA ACT06AGTAA 



Seq ID NO: 675 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

I I I I I I 

MGYQRQEPVI PPQVNKTFGP PGYLLIiSVLQ FLYPFIAMIS YITIIAGDTLS KVFQRIPGVD 60 

PENVPZGRHF IIGLSTVTFT LPLSLYRNIA KLGKVSLIST QLTTLILGIV MASAI SLGPE 120 

IPKTEDAWVF AKPNAIQAVG VMSFAPICHH NSFLVYSSLE EPTVAKWSRL IHMSIVISVP 180 

ICIFFATCGY LTPTGFTQGD LFEHYCRllDD LVTFGRFCYG VTVILTYPME Ca?VTREVIAlI 240 

VFFGGNLS5V FBZWTVMVZ TVATLVSLLZ DCLGIVLELN 6VLCATPt.IP XIPSACYZiKL 300 

SEBPRTRSDK ZMSCVMLPI6 AWHVPGFVM AXmTQDCTH GQEMFYCFPD NFSLTNTSES 360 
HVQQTTQLST LNISIFQLE 

Seq ID NO: 676 DNA sequence

Nucleic Acid Accession #: NM_006853.1
Coding sequence: 26..874

1 11 21 31 41 51 

I I I I I I

AGGAATCTGC GCTOGGGTTC C6CAGATGCA GA6GTT6A6G TGGCTGCGGG ACTOGAAGTC 60 

ATCGQGCAGA GGTCTCACAG OVGCCAAGGA ACCTGGGGCC OGCTCCTCCC CCCTCCAGGC 120 

CATGAGGATT CTGCAGrTAA TCCTGCTTGC TCTGGCAACA GGGCTTGTAQ GGOGAGAGAC 180 

CAGGATCATC AAGGGGTTCG AGTGCAAGCC TCACTCCCAO CCCTGGCAGO CAGCCCTGTT 240 

OGAGAAGAOG OGGCTACTCT 0TGQG6CQAC GCTCATCGCC CCCAGATGGC TCCTCACAGC 300 

AGCCCACTOC CTCAAOCCCC GCTACATAGT TCACCTGGGG CAGCACAACC TCCAGAAGQA 360 

GGAOGGCTGT GAGCAGACCC GGACA6CCAC TGAGTCCTTC CCCCACCCCG GCTTCAACAA 420 

CA6CCTCCCC AACAAAGACC ACC6CAATGA CATCATGCTG GTGAAGATGG CAT06CCA6T 480 

CTCCATCACC T6G6CTGT6C GACCCCTCAC CCTCTCCTCA OGCTGTGTCA CTGCTGGCAC 540 

CA6CTGCCTC ATTTCCGGCT GGGGCAGCAC GTCCAQCCCC CAOTTAOGCC TGCCTCACAC 600 

CTTOCGATGC GCCAACATCA CCATCATTGA GCACCAGAAO TGTGAGAAOG CCTACCCOSG 660 

CAACATCACA GACACCATGG TGT6TGCCAG OGTGCAGGAA GGGGGCAAGG ACTCCTGCCA 720 

GGG7X3ACTCC GGOGGOCCTC 7X3GTCT6TAA CCASTCTCT7 CAAGGCATTA TCTCCTGGGG 780 

CCA6GATC06 TGTGOSATCA GCCGAAAGCC TGGTGTCTAC AC6AAAQTCT GCAAATATGT 840 

GGACTGGATC CAGGAGAOGA TOAAGAACAA TTAGACTOGA CCCACCCACC ACAGCCCATC 900 

ACCCTCCATT TCCACTTGGT GTTPGGTTCC TGTTCACTCT GTTAATAAGA AACCCTAAGC 960 

CAAGACCCTC TAQ6AACATT CTTTGGGCCT CC7GGACTAC AGGASA7GCT GTCACTTAAT 1020 

AATCAACCTG GGGTTOGAAA TCAGTGAGAC CTGGATTCAA ATTCTGOCTT GAAATATTGT 1080 

GACTCTGGGA ATGACAACAC CiWmVlT CTCIGTT6TA TCCGCAQCCC CAAAOACAGC 1140 
TCCTGGCCAT ATATCAAGGT TTCAATAAAT ATTTGCTAAA T8A8T6 



Seq ID NO: 677 Protein sequence
Protein Accession #: NP_006844.1

1 11 21 31 41 51

I I I I I I

MRIU2LIUA XATGIiVGGET RIIKOFECXP BSQFWQAALF EKTRLLGGAT LZAPRHLLTA 60 




AHCLKPRYIV HLGQHNLQKE EGCEQTRTAT ESFPHPGFNH SLPNKDHRND IMLVKMASPV 120 

SITWAVRPLT LSSRCVTAGT SCLISGWGST SSPQUILPST IiRCAITITIIB HQKCENAyPG 180 

NITDTMVCAS VQEGGKDSCQ CDSGGPLVCM QSLQGHSWO QDPCAITRKP GVYTKVCKyV 240 
DWIQBTMKKH 

Seq ID NO: 678 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..933

1 11 21 31 41 51

ATCTGCAGCA ATGGACGGTG CATCCCGGGC GCCTGGCAGT GTGACGGGCT GCCTGACTGC 60 

TTCGACAAGA GTGATCAGAA GGAGTGCCCC AAGGCTAAGT OSAAATGTGO C OOGACCTT C 120 

TTCCCCTOTG CCAGCC3GCAT CCATTGCATC ATTGGTOGCT TCOGGTOCAA TGGGTTTGAG 180 

GACTGTCCOO ATC6CAG0GA TCAAGAOAAC TGCACAGCAA ACCCTCTGCT TTGCTCCACC 240 

GCCCGCTACC ACTCCAAGAA CGGCCTCXOT ATTGACAAGA GCTTCATCTG OGATGGACAG 300 

AATAACTCTC AAGACAACAG TGATGAGGAA AGCTGTGAAA GTTCTCAAGA ACCOGGCAGT 360 

GGGCAGGTGT TTGTX5ACTTC AGAGAACCAA CTTGTGTATT ACCCCAGCAT CACCTATGCC 420 

ATCATCGGCA GCTCC6TCAT TTTTGTGCTG GTGGTGGCCC TGCTGGCACT GGTCTTGCAC 480 

CACCAGOGGA AGCGOAACSUV CCTCATCAOO CTOCOOGTOC ACCGGCTGCA GCACCCTCTG 540 

CTGCTCTCCC GCCTCGIGGT CCTGGACCAC CXJCCACCACT GCAAC3GTCAC CTAC3UVCGTC 600 

AATAATGGCA TCCAGTATGT 6GCCA6CCA0 GCGGAGCAGA ATGOGTOGOA AGTAGGCTCC 660 

CC3VCCCTCCT ACTCCGAGGC CTTCCTGGAC CAGAGGCCTG OGTGGTATGA CCTTCCTCCA 720 

COGCCCTACT CTTCTGACAC GGAATCTCTG AACCAAOCCX3 ACCT6CCCCC CTACCGCTCC 780 

OGGTCOGGGA GTG0CAACW5 TGCCAGCTCC CAGGCAGCCA GCftGCCTCCT OAGOOTGGAA 840 

GACACCAGCC AC3W5CCCGGG GCA6CCTGGC COCCAGGAOG GCACTGCTGA OCOCAGGGAC 900 
TCTGAGCCCA GCCAGGGCAC TGAAGAAGTA TAA 

Seq ID NO: 679 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

icSNGRCIPG AWQCDGLPDC FDKSDEKECP KAKSKOSPTF PPCASGIHCI IGRPRa«3FE 60 

DCPDGSDEEN CTAHPLLCST ARYHCaWGLC IDKSPICDGQ NMOQDNSDBE 8CESSQBPGS 120 

OQVFVTSBNO bVYYPSITYA IIGSSVIPVL WAIiIALVLH EQRKRNHLMT LPVHRMIHPV 180 

LLSRLWUIH PHHCHVTMW HHGIQYVASQ ABQNASBVOS PPSYSEALU) QRPAHYDLPP 240 

PPY8SDTBSL HQADLPPVRS RSGSANSASS QAASSLLSVE DTSHSPGQPG PQEtSTAEPRD 300 
SEPSQGTEEV 
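
A nucleotide/protein pair such as Seq ID NO: 678 and Seq ID NO: 679 can be cross-checked by translating the region named on the "Coding sequence" line. The following Python sketch is illustrative only and is not part of the original disclosure. It assumes the standard genetic code; the helper names `coding_region` and `translate` are hypothetical, and the input is the first row (positions 1 to 60) of Seq ID NO: 678 as printed above.

```python
# Illustrative sketch, not part of the original specification.
# Assumption: the standard genetic code applies to the listed sequences.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def coding_region(dna, start, end):
    """Return the 1-based, inclusive start..end slice of a sequence."""
    return dna[start - 1:end]


def translate(cds):
    """Translate codon by codon; stop at the first stop codon.

    Codons containing unreadable characters are reported as 'X'.
    """
    cds = cds.upper().replace(" ", "")
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE.get(cds[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First row of Seq ID NO: 678 as printed in the listing.
row1 = "ATCTGCAGCA ATGGACGGTG CATCCCGGGC GCCTGGCAGT GTGACGGGCT GCCTGACTGC"
first_60 = row1.replace(" ", "")
peptide = translate(coding_region(first_60, 1, 60))
print(peptide)  # ICSNGRCIPGAWQCDGLPDC
```

The output agrees with the first twenty residues of Seq ID NO: 679 where those residues are legible in the listing.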

Seq ID NO: 680 DNA sequence
Nucleic Acid Accession #: S78203.1
Coding sequence: 1..2190 

1 11 21 31 41 51 

ATGAATCCTT TCCAGAAAAA TGAGTCCAAO GAAACTCTTT TTTCACCTGT CTCCATTGAA 60 

GRGGTAOCAC CTOGACCACC TAGCCCTCCA AAGAAGCCAT CTCCGACAAT CTGTGGCTC C 120 

AACTATOCAC TGAGCATTGC CTTCATTGTG GTGAATGAAT TCTGCGAGCG CTTrTCCTAT 180 

TATCGAATCA AAGCTGTGCT GATCCTGTAT TTCCTOTATT TCCT6CACTG GAATQAAGAT 240 

ACCTCCACAT CTATATACCA TGCCTTCAGC AGCCTCTGTT ATTTTACTCC CATOCTMSl 300 

GCAGCCATTO CTGACTOGTG GTTGGGAAAA TTCAAGACAA TCATCTATCT CTCCTTGGTG 360 

TATGTGCTTO GCCAIOTGAT CAAGTCCTTG 6GTGCCTTAC CAATACTGGG AGGACAAGTG 420 

GTACACACAO TCCTATCATT GATOOOCCTQ AGTCTAATAG CTTTGGGGAC AGGAGGCATC 480 

AAACCCTCTO TGGCAGCTTT TGGTGGAGAC CAGTTTGAAG AAAAACATOC AGAOGAACOO 540 

ACTAGATACT TCTC?U3TCTT CTACCTGTCC ATCAATGCAO G6AGCTTGAT TTCTACATTT 600 

ATCACACCCA TGCTGAGAGG AGATGTGCAA TGTTTTGGAG AAGACTGCTA TGCATTGGCT 660 

TTTGGAGTTC OWSGACTOCT CATGGTAATT GCACTTGTTG TGITPGCAAT GGGAAGCAAA 720 

ATATACAATA AACCACCCCC TGAAGGAAAC ATAGTGGCTC AAGTTTTCAA ATGTATCTGG 780 

TTTCCTATTT CCAATCGTTT CAAGAACCGT TCTGGAGACA TTOCAAAOOG ACAOCACXOG 840 

CTAGACTGGG CAGCXGAGAA ATATCCAAAG CAGCTCATTA TGGATOTAAA GGCACTGACC 900 

AGOOTACTAT TCCTTTATAT CCCATTGCCC ATGTTCTOGG CrCTTTTGGA TCAGCAGGGT 960 

TCA06ATCGA CmOCAAGC CATCAGGATG AATAGGAATT TGGGGTTTTT TGTGCTTC AG 1020 

CCGGACCAGA TGCAGGTTCT AAATCCCTTT CTGGTTCTTA TCTTCATCCC GTTGTTTGAC 1080 

TTTGTCATTT ATCGTCTGGT CTCCAAGTGT GGAATTAACT TCTCATCACT TAGGAAAATQ 1140 

GCTGTTG6TA TGATCCTAGC GTGCCTGGCA TTTGCAGTTG OGGCAGCTGT AGAGATAAAA 1200 

ATAAATOAAA TGGCCCCAGC CCAOTCACGT CCCCAGGAOG TTTTCCTACA AGTCTTGAAT 1260 

CTGGCA6ATC ATGAG6TGAA 06TGACAGT6 6TGGGAAATG AAAACAATTC TCTGTTGATA 1320 

GA6TCCATCA AATCCTTTCA GAAAACACCA CACTATTCCA AACTGCACCT GAAAACAAAA 1380 

AGCCAGGATT TTCACTTCCA CCTGAAATAT CACAATTTGT CTCTCTACAC TGAGCATTCT 1440 

GTGCAGGAGA AGAACTCGTA GAGTCTTGTC ATTC6TGAAG ATGGGAACAG TATC TCCAQC 1500 

ATOATOGTAA ACGATACAGA AAGCAAAACA ACCAATGGGA TGACAACCGT GAGGTTTGTT 1560 

6AAGACTATG GT6TGTCT6C TTATAGAACT GTGCAAAGAG GAGAATACCC TGCAGTGCAC 1680 

TCTAGAACA6 AAGATAAGAA CTTTTCTCTG AATTTGGGTC TTCTA6ACTT TCGIWAOCA 1740 

TATCTGTTTG TTATTACTAA TAACACCAAT CAGGGTCTTC AGGCCTGGAA GATTGAAGAC 1800 

ATTCCA6CCA ACAAAATGTC CATTGOOTGG CAGCTACCAC AATATGCCCT GGTTACACCT 1860 

GGG6AQGTCA TGTTCTCTOT CACAGGTCTT GAGTTTTCTr ATTCTCAGGC TCCCTCTAGC 1920 

ATGAAATCro TCCTCCAGGC AGCTTGGCTA TTGACAATTG CAGTTGGGAA TATCATOGTG 1980 

CTTGTTGTOG CACAGTTCAG TGGCCTGGTA CAGTGG6CCG AATTCATTTT GTTmPOCTOC 2040 

CTCCTGCXQG TGATCTGCCT 6ATCTTCTCC ATCATGG6CT ACTACTATGT TCCTGTAAAO 2100 

ACAGAaGAXA TOOQGGOTCC AOCAGATAAG CACATTOCTC ACATOCAGGO GAACATGATC 2160 
AAACTAGA6A CCAA6AAGAC AAAACTCTGA 
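
The "Coding sequence" lines give 1-based, inclusive ranges, so the length of the encoded protein can be predicted from the range alone. The sketch below is illustrative and not part of the original disclosure. It assumes that each range ends with the stop codon, as the TAA and TGA endings of Seq ID NO: 678 and 680 suggest; the function names are hypothetical.

```python
# Illustrative sketch, not part of the original specification.
# Assumption: the stated coding range includes the terminal stop codon.

def parse_cds_range(text):
    """Parse a 'start..end' range such as '128..667' (1-based, inclusive)."""
    start, end = text.replace(" ", "").split("..")
    return int(start), int(end)


def expected_protein_length(start, end):
    """Number of residues encoded by start..end, excluding the stop codon."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return length // 3 - 1


for text in ("1..933", "1..2190"):
    print(text, expected_protein_length(*parse_cds_range(text)))
# 1..933 310
# 1..2190 729
```

The value 310 matches the printed length of Seq ID NO: 679 (five full rows of 60 residues plus a final row of 10).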

Seq ID NO: 681 Protein sequence
Protein Accession #: AAB34388.1

1 11 21 31 41 51
| | | | | |
MNPFQKNESR ETLPSPVSIB BVPPRPPSPP KKPSPTIOGS NYPLSZAFIV VKEFCERPSY 60
YQ4KAVLILY FLYFIiHWNID TSTSIYHAFS SLCYPTPILG AAZADSHLGK FKTIIYLSLV 120
yVLGHVXKSX. GALPIU3GQV VHTVLSLIGL SLIAIiSTGGZ KPCVAAFGGD [illegible] 180
TRYFSVFYLS IHAGSLISTF ITPKXiRGDVQ CFBEDOCALA FGVPGUMVI ALVVFAM6SX 240
lYKKPPPEGH IVAQVPKCIH PAISNRFKMR SGDXPXRQHW IiDKAABKYPK QLZKDVKALT 300
RVLPLYIPLP MFKAItLDQQO SRWTLQAIHM NRNLGPPVLQ PDQMQVWF LVLIFIPLFD 360
FVIYRLVSKC GZNFSSLRKM AVGMILACLA PAVAAAVEIK INEMAPAQSG PQBVFLQVLN 420
UUJOBVKVTV [illegible] ESZKSFQKTP HYSXZiKLICnC SQDPHFBLKY HNLSLYTEBS 480
VQBRimYSLV IRED6KSISS MMVXDTESKT TSOCTTVRFV NTLBXDVKZS LSTDTSLNVO 540
BDYGVSAYRT VQRGEYPAVM CRTEDXMPSL NLGIiLOFGAA YLFVZTNimi QGLQAHKIBD 600
IPANKMSIAW QLPQYALVTA GSVMFSVTGL BFSYSQAPSS MKSVXiQAAML LTZAV6HZZV 660
LWAQFSGLV QHAEFILFSC LLLVXCLZFS ZKGYYYVPVK TEDMRGPADX HZPBZQ6BIMZ 720
KXiSTKKTKL

Seq ID NO: 682 DNA sequence
Nucleic Acid Accession #: NM_016077.1
Coding sequence: 128..667

1 11 21 31 41 51
| | | | | |
TOGCTTTGTG ATTCTTGATC CGGAACTTTG TCACCCAGGA ACCCOGGAAG AOGTAGCTCA 60
0606ATAGAA A0GT6TT0GC TTOCCCAGAA GAAGGGAAGG OGOGAGTGAG GAAAGGAGGT 120
ACTGTAGATG CCCTCCAAAT CCTTGGTTAT GGAATATTTG GCTCATCCCA GTACACTOGG 180
CTTGGCTGTT GGAGTTGCTT GTGGCATGTG CCTGGGCTGG AG0CTT06A0 TATGCTTTGG 240
GATGCTCXXC AAAAGCAAGA CGAGCAAGAC ACACACAGAT ACTGAAAGT6 AAGCAAGCAT 300
CTTGGGA6AC AG06GGGA6T ACAAGATGAT TCTTGTGGTT 06AAATGACT TAAAGATG6G 360
AAAAG6GAAA GTGGCTGCCC AGTGCTCTCA T6CTGCTGTT TCAGCCTACA AGCAGATTCA 420
AAGAAGAAAT CCTGAAATGC TCAAACAATQ GGAATACTGT GGCCAGCCCA AGGTGGTGQT 480
CAAAGCTCCT GATGAAGAAA CCCTGATTGC ATTATTGGCC CATGCAAAAA TGCTGGGACT 540
QACTGTAAGT TTAATTCAAG ATGCTGGAOG TACTCAGATT 6CACCAGGCT CTCAAACTGT 600
GCTAGGGATT G6G0CAGGAC CAGCASACCr AAITGACAAA 6TCACTGGTC ACCTAAAACT 660
TTACTAGGTO GACTTTGATA TQACAACAAC COCTCCATCA CAAGTGTTTG AAGCCTGTCA 720
GATTCTAACA ACAAAAGCTG AATTTCTTCA CGCAACTTAA AT6TTCTTGA GATGAAAATA 780
AAACCTATTC CCATQTTCTA

Seq ID NO: 683 Protein sequence
Protein Accession #: NP_057161.1

1 11 21 31 41 51
| | | | | |
HPSKSLVMBY lAHPSTLGLA VGVAO(»CLG HJSLRVCPGNL FKSKTSKTBT DTBSBASILG 60
DSGSYXNZLV VRNDLKMGKG KVAAQCSKAA VSAYKOIORR NPEKLKQWEY CGQPXVWKA 120
POEETLIALL AKAKMLGLTV SLIQDAGRTQ ZAPGSQTVLO ZGPGPADLZD RVTGBLKLY

Seq ID NO: 684 DNA sequence
Nucleic Acid Accession #: NM_004864.1
Coding sequence: 26..952

1 11 21 31 41 51
| | | | | |
CQGAA06AGG GCAACCIGCA CAGCCATGCC CGOGCAAOAA CTCAGGACOG TGAATGQCrC 60
TCAGAT6CTC CTGOTGTTGC TGGTGCTCTC GTGGCTGC03 CATGGGGGOG CCCTGTCTCT 120
G6CCGAGGCG A6CCGCGCAA GTTTCXX3GGQ ACCCTGAGAG TTGCACTCCG AAGACTCCAG 180
ATTCCGAGAG TTGOQGAAAC GCTACGAGGA CC7GCTAACC AGGCTGCGGG CCAftCCAGAG 240
CTGGGAAGAT TCGAACACCG ACCTCGTCCC GGCCCCTGCA GTCOGGATAC TCAOGOCAGA 300
AGTGOQGCIG GGATCCGGOQ 6CCACCTGCA CC7GGQTATC TCT06GGC0S CGCTTCCOGA 360
G6GGCTCCCC GAG6CCT0CC GCCTTCAC06 G6CTCTGTTC CQGCT6TCCC 0QA0GG06TC 420
AAGGTCGTGG GACGTGACAC GACOGCTGOG GCGTCAGCTC AGCCTTGOUV 6ACCCCAA6C 480
GCCCGCGCTG CACCTGCGAC TGTOGCOGCC GCCGTG6CAG TCGGACCAAC TGCTGGCAGA 540
ATCTTCGTOC GCACGGCOOC AGCTOGAGTT GCACTT6CGG COGCAAGCC6 CCAGGGGGOG 600
COGCAQAGOQ OGTGCSOOCA AC6GGGACGA CTGTOCGCTC GQGCCOGGGC GTTGCTGCCO 660
TCTGCACACS 0TC0SCG06T CQCTG6AAGA CCTGQGCTGQ OC0QATTG06 TGCTGT06CC 720
A0GGGAGGTX3 CAAGTGAOCA T6TGCAT0GG 0GCGTQCC08 AGCCAGTTCC GGGOGGCAAA 780
CATGCAOGCQ CAGATCAAGA OGAGCCTGCA CC6CC7GAAG CCCGACACG6 AGCCAGOSCC 840
CTGCTGCBTG CCC6CCA6CT ACAATCCCAT GGTX3CTCATT CAAAAGACOS ACACCGGGGT 900
OTCGCTCCAQ ACCTAT6ATG ACTTGTTA6C CAAAGACTGC CACTGCATAT 6A0CA0TCCT 960
GGTCCTTCCA CTGTGCAOCT G0G0GG6GGA GGOGACCrCA GTTGTCCTGC GCTGTGGAAT 1020
GGGCTCAAGG TTCCTGAGAC ACCCGATTCC TGCCCAAACA OCTGTATTTA TATAAGTCTG 1080
TTATTTATTA TTAATTTATT GGGGTGACCT TCTTOGOGAC TCGGGGOCTG GTCTGATGGA 1140
ACT6TGTATT TATTTAAAAC TCTGGTGATA AAAATAAA8C TGTCT6AACT GTTAAAAAAA 1200
AAAA

Seq ID NO: 685 Protein sequence
Protein Accession #: NP_004855.1

1 11 21 31 41 51
| | | | | |
KFGQSLRTVK GSQMLLVLLV LSWLPUGGAL SLABASRASF PGP5ELHSED SRFRELRKRY 60
EDUiTRLRAN QSWBDSNTDL VPAPAVRILT PBVRIiGSQGH LHI£ISRAAL PEGLPEASRL 120
BRALFRLSPT ASRSHOVTRP LRRQZiSLARP QAPAUOALS PPPSQSDQLL AESSSARPQL 180
EimAPQAAR GRRRARABNG DDCPLGPGRC CRXOTVRASIi EDLGKADHVL SFREVQVTKC 240
IGACPSQFRA ANMUAQZKTS XiHRLKFDTEP APCCVPASYN PMVLZQXTDT GVSIiQTYDDL 300
LAKDCHCZ

Seq ID NO: 686 DNA sequence
Nucleic Acid Accession #: NM_003423.2
Coding sequence: 48..851

1 11 21 31 41 51
| | | | | |
AOCAAATCAA CCATAG6TCC AAGAACAATT GTCTCTG6AC OGCAGCtATG CC3ACTCACCG 60
TGCTOTGTGC TGTGTGCCTG CTGCCTGGCA GOCTGGCCCT GCCGCTGCCT CAGGAGGOGG 120
GAGGCATGAG TGAGCTACAG TGGGAACAGG CTCAGGACTA TCTCAAGAGA TTTrATCTCT 180
ATGACTCAGA AACAAAAAAT GCCAACAGTT TAGAAGCCAA ACTCAA06AG AXGOVAAAAT 240
TCTTTGGCCT ACCTATAACT GGAATGTZAA ACTOOCQGGT CATAOAAATA AIGCAGAAOC 300
GQ\GATGTG6 AGTGCCAGAT 6TTGCA6AAT ACTCACTATT TOCAAATAGC CCAAAATGGA 360
CTTCCAAAGT GGTCAOCTAC AGGATOGTAT CATATACTOG AGACTTACCG CATATTACAG 420
TGGATCGATT AGT6TCAAAG GCTTTAAACA TOTGGGGCAA AGAGATCCCC CTGCATTTCA 480
GGAAAGTTQT ATGGGGAACT 6CTGACATCA TGATTGGCTT TQOGOGAGGA GCTCA7X3GGG 540
ACTGCIACCC ATTTGATGGO CCAGGAAACA OBCTSGCTCA TG0CTTTG08 CCTGGOACAG 600
GTCTOGGAGG AGATGCTGAC TT0GATGAG6 ATGAAOGCTG GA0GGAT6GT AGCAGTCTAO 660
GGATTAACTT CCTGTATGCT GCAACTCATG AACTTGGCCA TTCTTTGGGT ATGGGACATT 720
CCTCTQATCC TAATGCAGTG ATGTATCCAA CCTATGGAAA TGGAGATOCC CAAAATTTTA 780
AACTTTCCCA GGATOATATT AAAOGCATTC AGAAACTAtA TGGAAA6AGA AGTAATTCAA 840
GAAAGAAATA GAAACTTCA6 GCAGAACATC CATTCATTC» TTCATTCGAT TGTATATCAT 900
TxnrrccACAA TCAGAATTGA TAACCACTGT TCCTCCACTC CATTTAGCAA TTATGTCACC 960
CTTTTTTATT GC31GTTGGTT TTTGAATGTC TTTCACTCCT TTTATTGGTT AAACTCCTTT 1020
ATGGTGTGAC TGTGTCTTAT TCCATCTATG AOCTTTGTCA GTGCGOQTAG AT6TCAATAA 1080
ATGTTACATA CACAAATAAA TAAAATGTTT ATTCCATGGT AAATTTA



Seq ID NO: 687 Protein sequence
Protein Accession #: NP_002414.1

1 11 21 31 41 51
| | | | | |
MRLTVIiCAVC LLPGSIALPIi PQEAGGMSEL QHBQAQDYLR RPYIjYDSETR NANSLEAKLK 60
EMQKFFGIiPI TQ4LNSRVIE IMQXFR06VP DVAEYSLFPH SPXHTSKWT YRIVSYTRDL 120
PHZTVDSLV8 KAUnCWGKBI PLHFRKWHG TADIHIGFAR GAHGDSYPFD GPGNTLAHAP 180
APGTCLQGDA HFDEDERWTD GSSUSUfFlX AATBEUSBSIi GM6HSSDPNA VVXPTYGHGD 240
FQMFKLSQOD ZKQIQXZiYQK R5NSRKR



Seq ID NO: 688 DNA sequence

Nucleic Acid Accession #: NM_005221.3

Coding sequence: 1..870

1 11 21 31 41 51

| | | | | |

ATGACAGGAG 7GTTTGACA0 AAGOOTCCCC AGCATCCGAT CCGGG8ACTT CCAAGCTC08 60 

TTCCAGAOGT CCGCAGCTAT GCACCATCCO TCTCAGGAAT OGCCAACTTT GCCOGAGTCT 120 

TCAGCTACCG ATTCTGACTA CTACAGCCCT ACGGGGGGAG CCCCGCACGG CTACTGCTCT 180 

CCTACCTCX3G CTTCCTATGG CAAA6CTCTC AACCCCTACC AGTATCAGTA TCACGGOGTG 240 

AAC6GCTCGQ COGQGAGCTA COCAGCCAAA GCTTATGCCG ACTATAGCTA CGCTAQCTCC 300 

TACCACCAOT AOGGCG606C CTACAACG6C GTCCCAA60G CCACCAACCA GCCAQAOAAA 360 

GAAGHGACCG AGCCOGAGGT GAGAATGGTO AATX3GCAAAC CAAAGAAAGT TCGTAAACCC 420 

AGGACTATTT ATTCGAGCTT TCAGCTGGCC GCATTACAGA GAAGGTTTCA GAAGACTCAG 480 

TACCrOGCCT T6CC6GAACG CGCOQAGCTG GCCGCCTC6C TGGGATTGAC ACAAACACAG 540 

GTGAAAATCT GGTTTCA6AA CAAAAGATCC AAGATCAAGA AGATCATQAA AAACGGQGAG 600 

ATGCCCCCQG AGCACAGTCC CAGCTCCA6C GACCCAATGG OGTGTAACTC GCCOGAGTCT 660 

CCAGOGGTGT GGGAGCOCCA GGGCTCGTCC CGCTOGCTCA GCCACCACCC TCATGCCCAC 720 

CCTCOGACCT CCAACCAGTC CCCAOCOTCC AGCTACCTGG AGAACTCTGC ATCCTGGTAC 780 

ACAAGTGCAG CCAGCTCAAT CAATTCCCAC CTG006CCGC OGGGCTCCTT ACAGCACCOG 840
CTGGOSCTOG CCTCC9G6AC ACTCTATTAG

Seq ID NO: 689 Protein sequence
Protein Accession #: NP_005212.1

1 11 21 31 41 51

| | | | | |

KT6VPDRRVP SIRSGOFQAP FQTSAAMKHP SQESPTIiPES SATDSDYYSP TGGAFEGYCS 60

PTSASYGKAL NPyQYQVHGV NGSAGSVPAK AYADYSYASS YHQYGGAYNR VPSATNQPBK 120

EVTBPEVRNV MGXPXICVRXP RTZYSSFQIiA ALQRRFQKTQ YXALPERAEL AASI^TQTQ 180 

VKIHFC^IKRS KIKKZtlXIIGB MPPEHSPSSS DPKACNSPQS PAVHBPQ68S RSLSHHFHAB 240 
PPTSISIQSPAS SYLENSASITY TSAASSZMSH LPPPGSLQpEIP LAIASGTLY

It is understood that the examples described above in no way serve to limit the true
scope of this invention, but rather are presented for illustrative purposes. All publications,
sequences of accession numbers, and patent applications cited in this specification are herein
incorporated by reference as if each individual publication or patent application were
specifically and individually indicated to be incorporated by reference.

WHAT IS CLAIMED IS:

1. A method of detecting a lung cancer-associated transcript in a cell
from a patient, the method comprising contacting a biological sample from the patient with a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1A-16.

2. The method of claim 1, wherein the polynucleotide selectively
hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1A-16.
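
The 80% and 95% thresholds recited in claims 1 and 2 can be illustrated with a simple percent-identity calculation. The Python sketch below is illustrative only and is not part of the claims. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and counts identical columns over the alignment length; the definition of identity given in the specification governs, and the `candidate` sequence is hypothetical.

```python
# Illustrative sketch, not part of the original claims.
# Assumption: inputs are pre-aligned, equal-length strings; '-' marks a gap.

def percent_identity(seq_a, seq_b):
    """Percentage of alignment columns holding the same non-gap character."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be pre-aligned, non-empty and of equal length")
    matches = sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


reference = "ATCTGCAGCA"  # first ten bases of Seq ID NO: 678 as printed
candidate = "ATCTGCAGCT"  # hypothetical variant differing at one position
identity = percent_identity(reference, candidate)
print(identity)                            # 90.0
print(identity >= 80.0, identity >= 95.0)  # True False
```

Under these assumptions the hypothetical variant would meet the 80% threshold of claim 1 but not the 95% threshold of claim 2.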

3. The method of claim 1, wherein the biological sample is a tissue
sample.

4. The method of claim 1, wherein the biological sample comprises
isolated nucleic acids.

5. The method of claim 4, wherein the nucleic acids are mRNA.

6. The method of claim 4, further comprising the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

7. The method of claim 1, wherein the polynucleotide comprises a
sequence as shown in Tables 1A-16.

8. The method of claim 1, wherein the polynucleotide is labeled.

9. The method of claim 8, wherein the label is a fluorescent label.

10. The method of claim 1, wherein the polynucleotide is immobilized on
a solid surface.

11. The method of claim 1, wherein the patient is undergoing a therapeutic
regimen to treat lung cancer.

12. The method of claim 1, wherein the patient is suspected of having lung
cancer.

13. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated transcript in the
biological sample by contacting the biological sample with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16,
thereby monitoring the efficacy of the therapy.

14. The method of claim 13, further comprising the step of: (iii) comparing
the level of the lung cancer-associated transcript to a level of the lung cancer-associated
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.
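
The comparison recited in step (iii) can be expressed numerically in many ways; the claims do not prescribe a particular statistic. The Python sketch below is illustrative only and assumes, hypothetically, that transcript levels are available as non-negative measurements in arbitrary units and that a log2 ratio with a small pseudocount is an acceptable summary. The function name and the example values are not taken from the specification.

```python
# Illustrative sketch, not part of the original claims.
# Assumption: levels are non-negative measurements in arbitrary units.
import math


def log2_fold_change(earlier_level, later_level, pseudocount=1.0):
    """log2 of (later / earlier); the pseudocount avoids division by zero."""
    if earlier_level < 0 or later_level < 0:
        raise ValueError("levels must be non-negative")
    return math.log2((later_level + pseudocount) / (earlier_level + pseudocount))


before_treatment = 799.0  # hypothetical level prior to treatment
during_treatment = 199.0  # hypothetical level later in treatment
change = log2_fold_change(before_treatment, during_treatment)
print(change)  # -2.0, i.e. a four-fold decrease
```

Whether a decrease or an increase indicates efficacy would depend on whether the transcript in question is up-regulated or down-regulated in lung cancer.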

15. The method of claim 13, wherein the patient is a human.

16. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated antibody in the biological
sample by contacting the biological sample with a polypeptide encoded by a polynucleotide
that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in
Tables 1A-16, wherein the polypeptide specifically binds to the lung cancer-associated
antibody, thereby monitoring the efficacy of the therapy.

17. The method of claim 16, further comprising the step of: (iii) comparing
the level of the lung cancer-associated antibody to a level of the lung cancer-associated
antibody in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

18. The method of claim 16, wherein the patient is a human.

19. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated polypeptide in the
biological sample by contacting the biological sample with an antibody, wherein the antibody
specifically binds to a polypeptide encoded by a polynucleotide that selectively hybridizes to
a sequence at least 80% identical to a sequence as shown in Tables 1A-16, thereby
monitoring the efficacy of the therapy.

20. The method of claim 19, further comprising the step of: (iii) comparing
the level of the lung cancer-associated polypeptide to a level of the lung cancer-associated
polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic
treatment.

21. The method of claim 19, wherein the patient is a human.

22. An isolated nucleic acid molecule consisting of a polynucleotide
sequence as shown in Tables 1A-16.

23. The nucleic acid molecule of claim 22, which is labeled.

24. The nucleic acid of claim 23, wherein the label is a fluorescent label.

25. An expression vector comprising the nucleic acid of claim 22.

26. A host cell comprising the expression vector of claim 25.

27. An isolated polypeptide which is encoded by a nucleic acid molecule
having polynucleotide sequence as shown in Tables 1A-16.

28. An antibody that specifically binds a polypeptide of claim 27.

29. The antibody of claim 28, further conjugated to an effector component.

30. The antibody of claim 29, wherein the effector component is a
fluorescent label.

31. The antibody of claim 29, wherein the effector component is a
radioisotope or a cytotoxic chemical.

32. The antibody of claim 29, which is an antibody fragment.

33. The antibody of claim 29, which is a humanized antibody.

34. A method of detecting a lung cancer cell in a biological sample from a
patient, the method comprising contacting the biological sample with an antibody of claim
28.

35. The method of claim 34, wherein the antibody is further conjugated to
an effector component.

36. The method of claim 35, wherein the effector component is a
fluorescent label.

37. A method of detecting antibodies specific to lung cancer in a patient,
the method comprising contacting a biological sample from the patient with a polypeptide
encoded by a nucleic acid comprising a sequence from Tables 1A-16.

38. A method for identifying a compound that modulates a lung cancer-associated
polypeptide, the method comprising the steps of:
(i) contacting the compound with a lung cancer-associated polypeptide, the
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1A-16; and
(ii) determining the functional effect of the compound upon the polypeptide.

39. The method of claim 38, wherein the functional effect is a physical
effect.

40. The method of claim 38, wherein the functional effect is a chemical
effect.

41. The method of claim 38, wherein the polypeptide is expressed in a
eukaryotic host cell or cell membrane.

42. The method of claim 38, wherein the functional effect is determined by
measuring ligand binding to the polypeptide.

43. The method of claim 38, wherein the polypeptide is recombinant.

44. A method of inhibiting proliferation of a lung cancer-associated cell to
treat lung cancer in a patient, the method comprising the step of administering to the subject a
therapeutically effective amount of a compound identified using the method of claim 38.

45. The method of claim 44, wherein the compound is an antibody.

46. The method of claim 45, wherein the patient is a human.

47. A drug screening assay comprising the steps of:
(i) administering a test compound to a mammal having lung cancer or a cell
isolated therefrom;
(ii) comparing the level of gene expression of a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16 in a
treated cell or mammal with the level of gene expression of the polynucleotide in a control
cell or mammal, wherein a test compound that modulates the level of expression of the
polynucleotide is a candidate for the treatment of lung cancer.

48. The assay of claim 47, wherein the control is a mammal with lung
cancer or a cell therefrom that has not been treated with the test compound.

49. The assay of claim 47, wherein the control is a normal cell or mammal.

50. A method for treating a mammal having lung cancer comprising
administering a compound identified by the assay of claim 47.

51. A pharmaceutical composition for treating a mammal having lung
cancer, the composition comprising a compound identified by the assay of claim 47 and a
physiologically acceptable excipient.






